Preface


In 1978 I wrote an introductory textbook on general relativity and
cosmology, based on my lectures delivered to university audiences. The
book was well received and had been in use for about 15-20 years until 
it went out of print. The present book has been written in response to 
requests from students as well as teachers of relativity who have missed 
the earlier text. 

An Introduction to Relativity is therefore a fresh rewrite of the 1978 
text, updated and perhaps a little enlarged. As I did for the earlier text, I 
have adopted a simple style, keeping in view a mathematics or physics 
undergraduate as the prospective reader. The topics covered are what I 
consider as essential features of the theory of relativity that a beginner 
ought to know. A more advanced text would be more exhaustive. I have
come across texts whose formal and rigorous style or enormous size 
have been off-putting to a student wishing to know the A, B, C of the 
subject. 

Thus I offer no apology to a critic who may find the book lacking 
in some of his/her favourite topics. I am sure the readers of this book 
will be in a position to read and appreciate those topics after they have 
completed this preliminary introduction. 

Cambridge University Press published my book An Introduction to 
Cosmology, which was written with a similar view and has been well 
received. Although the present book contains chapters on cosmology, 
they are necessarily brief and highlight the role of general relativity. The 
reader may find it useful to treat the cosmology volume as a companion 
volume. Indeed, in a few places in this text he/she is directed to this 
companion volume for further details. 

It is a pleasure to acknowledge the encouragement received from 
Simon Mitton for writing this book. I also thank Vince Higgs, Lindsay 
Barnes, Laura Clark and their colleagues at Cambridge University Press 
for their advice and assistance in preparing the manuscript for publication.
Help received from my colleagues in Pune, Prem Kumar for figures,
Samir Dhurde and Arvind Paranjpye for images and Vyankatesh
Samak for the typescript, has been invaluable. I do hope that teachers
and students of relativity will appreciate this rather unpretentious
offering!


Jayant V Narlikar 
IUCAA, Pune, India 



Chapter 1 

The special theory of relativity 


1.1 Historical background 

1905 is often described as Einstein’s annus mirabilis: a wonderful year
in which he came up with three remarkable ideas. These concerned Brownian
motion in fluids, the photoelectric effect and the special theory of
relativity. Each of these was of a basic nature and also had a wide impact 
on physics. In this chapter we will be concerned with special relativity, 
which was arguably the most fundamental of the above three ideas. 

It is perhaps a remarkable circumstance that, ever since the initiation
of modern science with the works of Galileo, Kepler and Newton,
there has emerged a feeling towards the end of each century that the end
of physics is near: that is, most in-depth fundamental discoveries have
been made and only detailed ‘scratching at the surface’ remains. This
feeling emerged towards the end of the eighteenth century, when Newtonian
laws of motion and gravitation, the studies in optics and acoustics,
etc. had provided explanations of most observed phenomena. The nineteenth
century saw the development of thermodynamics, the growth in
understanding of electrodynamics, wave motion, etc., none of which had 
been expected in the previous century. So the feeling again grew that 
the end of physics was nigh. As we know, the twentieth century saw the 
emergence of two theories, fundamental but totally unexpected by the 
stalwarts of the nineteenth century, viz., relativity and quantum theory. 
Finally, the success of the attempt to unify electromagnetism with the 
weak interaction led many twentieth-century physicists to announce that 
the end of physics was not far off. That hope has not materialized even 
though the twenty-first century has begun. 

While the above feeling of euphoria comes from the successes of
the existing paradigm, the real hope of progress lies in those phenomena 
that seem anomalous, i.e., those that cannot be explained by the current 
paradigm. We begin our account with the notion of ‘ether’ or ‘aether’ 
(the extra ‘a’ for distinguishing the substance from the commonly used 
chemical fluid). Although Newton had (wrongly) resisted the notion 
that light travels as a wave, during the nineteenth century the concept 
of light travelling as a wave had become experimentally established 
through such phenomena as interference, diffraction and polarization. 
However, this understanding raised the next question: in what medium 
do these waves travel? For, conditioned by the mechanistic thinking of 
the Newtonian paradigm, physicists needed a medium whose disturbance
would lead to the wave phenomenon. Water waves travel in water,
sound waves propagate in a fluid, elastic waves move through an elastic 
substance ... so light waves also need a medium called aether in which to 
travel. 

The fact that light seemed to propagate through almost a vacuum 
suggested that the proposed medium must be extremely ‘non-intrusive’ 
and so difficult to detect. Indeed, many unsuccessful attempts were 
made to detect it. The most important such experiment was conducted 
by Michelson and Morley. 


1.2 The Michelson and Morley experiment 

The basic idea behind the experiment conducted by A. A. Michelson 
and E. Morley in 1887 can be understood by invoking the example of a 
person rowing a boat in a river. Figure 1.1 shows a schematic diagram of 
a river flowing from left to right with speed v. A boatman who can row
his boat at speed c in still water is trying to row along and across the river
in different directions. In Figure 1.1(a) he rows in the direction of the
current and finds that his net speed in that direction is $c + v$. Likewise
(see Figure 1.1(b)), when he rows in the opposite direction his net speed
is reduced to $c - v$. What is his speed when he rows across the river in
the perpendicular direction as shown in Figure 1.1(c)? Clearly he must
row in an oblique direction so that his velocity has a component v in a
direction opposite to the current. This will compensate for the flow of
the river. The remaining component $\sqrt{c^2 - v^2}$ will take him across the
river in a perpendicular direction as shown in Figure 1.1(c).

Suppose now that he does this experiment of rowing down the river 
a distance d and back the same distance and then rows the same distance 
perpendicular to the current and back. What is the difference of time τ
between the two round trips? The above details lead to the answer that
the time for the first trip exceeds that for the second by

$$\tau = \frac{d}{c - v} + \frac{d}{c + v} - \frac{2d}{\sqrt{c^2 - v^2}} \tag{1.1}$$

and, for small current speeds ($v \ll c$), we get the answer as

$$\tau \cong \frac{d}{c}\,\frac{v^2}{c^2}. \tag{1.2}$$

[Fig. 1.1. The three cases of a boat being rowed in a river with an intrinsic speed c, the river flowing (from left to right) with speed v: (a) in the direction of the river flow, (b) opposite to that direction and (c) in a direction perpendicular to the flow of the river.]
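As a quick numerical illustration of the round-trip comparison above, the following minimal Python sketch evaluates the exact time difference d/(c - v) + d/(c + v) - 2d/sqrt(c^2 - v^2) and compares it with the small-speed estimate (d/c)(v/c)^2. The function names and the sample values of d, c and v are illustrative assumptions and do not come from the text.

```python
import math

def round_trip_difference(d, c, v):
    """Exact excess time of the along-stream round trip over the cross-stream one."""
    along = d / (c - v) + d / (c + v)          # downstream leg plus upstream leg
    across = 2.0 * d / math.sqrt(c**2 - v**2)  # two legs at speed sqrt(c^2 - v^2)
    return along - across

def small_speed_estimate(d, c, v):
    """Leading-order estimate (d/c)(v/c)^2, meaningful only for v much smaller than c."""
    return (d / c) * (v / c) ** 2

# Hypothetical sample values, chosen only so that v/c = 0.01.
d, c, v = 100.0, 5.0, 0.05
exact = round_trip_difference(d, c, v)
approx = small_speed_estimate(d, c, v)
print("exact difference :", exact)
print("small-v estimate :", approx)
```

For these sample values the two numbers agree to roughly one part in ten thousand, as expected when the neglected terms are of relative order (v/c)^2.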


The Michelson-Morley experiment [1] used the Michelson interferometer
and is schematically described by Figure 1.2. Light from a source
S is made to pass through an inclined glass plate cum mirror P. The plate
is inclined at an angle of 45° to the light path. Part of the light from the
source passes through the transparent part of the plate and, travelling a
distance $d_1$, falls on a plane mirror A, where it is reflected back. It then
passes on to plate P and, getting reflected by the mirror part, it moves
towards the viewing telescope. A second ray from the source first gets
reflected by the mirror part of the plate P and then, after travelling a
distance $d_2$, gets reflected again at the second mirror B. From there it
passes through P and gets into the viewing telescope.

Now consider the apparatus set up so that the first path (length $d_1$) is
in the E-W direction. In a stationary aether the surface of the Earth will
have a velocity approximately equal to its orbital velocity of 30 km/s.
Thus $(v/c)^2$ is of the order of $10^{-8}$. In the actual experiment the apparatus
was turned by a right angle so that the E-W and N-S directions of the
arms were interchanged. So the calculation for the river-boat crossing
can be repeated for both cases and the two times added to give the
expected time difference as

$$\tau \cong \frac{d_1 + d_2}{c}\,\frac{v^2}{c^2}. \tag{1.3}$$

[Fig. 1.2. The schematic arrangement of Michelson's interferometer, as described in the text.]


Although the effect expected looks very small, the actual sensitivity 
of the instrument was very good and it was certainly capable of detecting 
the effect if indeed it were present. The experiment was repeated several 
times. Even if the Earth had happened to be at rest relative to the aether at the
time of one experiment, six months later its velocity relative to the aether
would have been at its maximum. But an experiment performed six months later also
gave a null result.

The Michelson-Morley experiment generated a lot of discussion. 
Did it imply that there was no medium like aether present after all? 
Physicists not prepared to accept this radical conclusion came up with 
novel ideas to explain the null result. The most popular of these was the 
notion of contraction proposed by George Fitzgerald and later worked
on by Hendrik Lorentz. Their conclusion, as summarized by J. Larmor
in a contemporary (pre-relativity) text on electromagnetic theory, reads
as follows:

... if the internal forces of a material system arise wholly from 
electromagnetic actions between the systems of electrons which constitute 
the atoms, then the effect of imparting to a steady material system a 
uniform velocity [v] of translation is to produce a uniform contraction of 
the system in the direction of motion, of amount $(1 - v^2/c^2)$...

It is clear that a factor of this kind would resolve the problem posed 
by the Michelson-Morley experiment. For, by reducing the length travelled
in the E-W direction by the above factor, we arrive at the same
time of travel for both directions and hence a null result. Lorentz went 
further to give an elaborate physical theory to explain why the Fitzgerald 
contraction takes place. 

The Michelson-Morley experiment was explained much more elegantly
when Einstein proposed his special theory of relativity. We will
return to this point after describing what ideas led Einstein to propose
the theory. As we will see, the Michelson-Morley experiment played no 
role whatsoever in leading him to relativity. 

1.3 The invariance of Maxwell's equations 

We now turn to Einstein’s own approach to relativity [2], which was motivated
by considerations of symmetry of the basic equations of physics,
in particular the electromagnetic theory. For he discovered a conflict
between Newtonian ideas of space and time and Maxwell’s equations,
which, since the mid 1860s, had been regarded as the fundamental equations
of the electromagnetic theory. An elegant conclusion derived from
them was that the electromagnetic fields propagated in space with the 
speed of light, which we shall henceforth denote by c. It was how this 
fundamental speed should transform, when seen by two observers in 
uniform relative motion, that led to the conceptual problems. 

The Newtonian dynamics, with all its successes on the Earth and 
in the Cosmos, relied on what is known as the Galilean transformation 
of space and time as measured by two inertial observers. Let us clarify 
this notion further. Let O and O' be two inertial observers, i.e., two 
observers on whom no force acts. By Newton’s first law of motion both 
are travelling with uniform velocities in straight lines. Let the speed of
O' relative to O be v. Without losing the essential physical information
we take parallel Cartesian axes centred at O and O' with the X, X' axes
parallel to the direction of v. We also assume that the respective time
coordinates of the two observers were so set that t = t' = 0 when O and
O' coincided.

Under these conditions the transformation law for spacetime variables
for O and O' is given by

$$t' = t, \quad x' = x - vt, \quad y' = y, \quad z' = z. \tag{1.4}$$

Since v is a constant, the frames of reference move uniformly relative 
to each other. Laws of physics were expected to be invariant relative to 
such frames of reference. For example, because of constancy of v, we 
have equality of the accelerations $\ddot{x}$ and $\ddot{x}'$. Thus Newton’s second law of
motion is invariant under the Galilean transformation. Indeed, we may 
state a general expectation that the basic laws of physics should turn out 
to be invariant under the Galilean transformations. This may be called 
the principle of relativity. 
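The equality of accelerations under the Galilean transformation can be checked numerically. The sketch below is only an illustration: the trajectory x(t) = t^3, the relative speed and the finite-difference step are arbitrary assumptions, and the helper names are hypothetical.

```python
def second_difference(f, t, h=1e-3):
    """Central finite-difference estimate of the second derivative of f at t."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / h**2

V = 3.0  # assumed constant speed of O' relative to O

def x_in_O(t):
    """Hypothetical trajectory of a particle as described by O."""
    return t**3

def x_in_O_prime(t):
    """The same trajectory described by O', using x' = x - V t and t' = t."""
    return x_in_O(t) - V * t

a = second_difference(x_in_O, 1.0)
a_prime = second_difference(x_in_O_prime, 1.0)
print("acceleration seen by O :", a)
print("acceleration seen by O':", a_prime)
```

Both estimates come out close to 6, the analytic acceleration of x = t^3 at t = 1, because the term -Vt is linear in time and drops out on differentiating twice.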

Paving the way to a mechanistic philosophy, Newtonian dynamics 
nurtured the belief that the basic laws of physics will turn out to be 
mechanics-based and as such the Galilean transformation would play a 
key role in them. This belief seemed destined for a setback when applied 
to Maxwell’s equations. Maxwell’s equations in Gaussian units and in 
vacuum (with isolated charges and currents) may be written as follows: 

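$$\nabla \cdot \mathbf{B} = 0; \qquad \nabla \times \mathbf{E} = -\frac{1}{c}\frac{\partial \mathbf{B}}{\partial t};$$

$$\nabla \cdot \mathbf{D} = 4\pi\rho; \qquad \nabla \times \mathbf{H} = \frac{1}{c}\frac{\partial \mathbf{D}}{\partial t} + \frac{4\pi}{c}\,\mathbf{j}. \tag{1.5}$$

Here the fields B, E, D and H have their usual meaning and ρ and
j are the charge and current density. We may set D = E and B = H in
this situation. Then we get by a simple manipulation, in the absence of
charges and currents,

$$\nabla \times (\nabla \times \mathbf{H}) = \nabla(\nabla \cdot \mathbf{H}) - \nabla^2 \mathbf{H} = \frac{1}{c}\frac{\partial}{\partial t}(\nabla \times \mathbf{E}) = -\frac{1}{c^2}\frac{\partial^2 \mathbf{H}}{\partial t^2}. \tag{1.6}$$

From this we see that H satisfies the wave equation

$$\Box\,\mathbf{H} = 0. \tag{1.7}$$

Similarly E will also satisfy the wave equation, the operator □ standing
for

$$\Box \equiv \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2.$$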

The conclusion drawn from this derivation is this: Maxwell’s equations
imply that the E and H fields propagate as waves with the speed c.
Unless explicitly stated otherwise, we shall take c = 1.¹

However, this innocent-looking conclusion leads to problems when 
we compare the experiences of two typical inertial observers, having 
a uniform relative velocity v. Suppose observer O sends out a wave 
towards observer O' receding from him at velocity v directed along 
OO'. Our understanding of Newtonian kinematics will convince us that
O' will see the wave coming towards him with velocity $c - v$. But then
we fall foul of the principle of relativity, that the basic laws of physics
are invariant under Galilean transformations. So Maxwell’s equations 
should have the same formal structure for O and O', with the conclusion 
that both these observers should see their respective vectors E and H 
propagate across space with speed c. 

This was the problem Einstein worried about and to exacerbate it he 
took up the imaginary example of an observer travelling with the speed 
of the wave. What would such an observer see? 

Let us look at the equations from a Galilean standpoint first. The 
Galilean transformation is given by 

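$$\mathbf{r}' = \mathbf{r} - \mathbf{v}t, \quad t' = t. \tag{1.8}$$

Although the general transformation above can be handled, we will
take its simplified version in which O' is moving away from O along the
x-axis and O and O' coincided when t' = t = 0. It is easy to see that the
partial derivatives are related as follows:

$$\frac{\partial}{\partial x} = \frac{\partial}{\partial x'}, \quad \frac{\partial}{\partial y} = \frac{\partial}{\partial y'}, \quad \frac{\partial}{\partial z} = \frac{\partial}{\partial z'}, \quad \frac{\partial}{\partial t} = \frac{\partial}{\partial t'} - v\frac{\partial}{\partial x'}.$$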

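If we apply these transformation formulae to the wave equation
(1.7), we find that the form of the equation is changed to

$$\frac{\partial^2 \mathbf{H}}{\partial t'^2} - 2v\,\frac{\partial^2 \mathbf{H}}{\partial t'\,\partial x'} + v^2\,\frac{\partial^2 \mathbf{H}}{\partial x'^2} - \nabla'^2 \mathbf{H} = 0. \tag{1.9}$$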

Clearly Maxwell’s equations are not invariant with respect to 
Galilean transformation. Indeed, if we want the equations to be invariant 
for all inertial observers, then we need, for example, the speed of light 
to be invariant for them, as seen from the above example of the wave 
equation. Can we think of some other transformation that will guarantee 
the above invariances? 
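The loss of form invariance under the Galilean transformation can be seen numerically. The sketch below is an illustration only: it takes c = 1 and one space dimension, uses the sample wave sin(x - t), and estimates the wave operator by finite differences. The relative speed, the sample point and the function names are assumptions made for the example.

```python
import math

V = 0.5  # assumed speed of O' relative to O, in units with c = 1

def wave_residual(f, t, x, h=1e-3):
    """Finite-difference estimate of (d2/dt2 - d2/dx2) f at the point (t, x)."""
    d2t = (f(t + h, x) - 2.0 * f(t, x) + f(t - h, x)) / h**2
    d2x = (f(t, x + h) - 2.0 * f(t, x) + f(t, x - h)) / h**2
    return d2t - d2x

def field(t, x):
    """A wave travelling along +x with unit speed in the frame of O."""
    return math.sin(x - t)

def field_galilean(t_p, x_p):
    """The same field written in the Galilean coordinates of O' (x = x' + V t', t = t')."""
    return field(t_p, x_p + V * t_p)

residual_O = wave_residual(field, 0.3, 0.7)
residual_O_prime = wave_residual(field_galilean, 0.3, 0.7)
print("wave operator in O  :", residual_O)
print("wave operator in O' :", residual_O_prime)
```

In the frame of O the residual vanishes to within numerical error, whereas in the Galilean coordinates of O' the same operator gives a clearly non-zero value, because the wave there travels with speed 1 - V rather than 1.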

In particular, let us ask this question: what is the simplest modification
we can make to the Galilean transformation in order to preserve
the form of the wave equation? We consider the answer to this question
for the situation of the two inertial observers O and O' described above.
We try linear transformations between their respective space and time
coordinates (t, x, y, z) and (t', x', y', z') so as to get the desired answer.
So we begin with

$$t' = a_{00}t + a_{01}x, \quad x' = a_{10}t + a_{11}x, \quad y' = y, \quad z' = z. \tag{1.10}$$

¹ In this book, as a rule, we will choose units such that the speed of light is unity when
measured in them.


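With this transformation, it is not difficult to verify that the wave
operator □ transforms as

$$\Box = \left(a_{00}\frac{\partial}{\partial t'} + a_{10}\frac{\partial}{\partial x'}\right)^2 - \left(a_{01}\frac{\partial}{\partial t'} + a_{11}\frac{\partial}{\partial x'}\right)^2 - \frac{\partial^2}{\partial y'^2} - \frac{\partial^2}{\partial z'^2}. \tag{1.11}$$

A little algebra tells us that the right-hand side will reduce to the
wave operator in the primed coordinates, provided that

$$a_{00}^2 - a_{01}^2 = 1, \quad a_{10}^2 - a_{11}^2 = -1, \quad a_{11}a_{01} = a_{10}a_{00}. \tag{1.12}$$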


Now, if we assume that the origin of the frame of reference of O' is
moving with speed v with respect to the frame of O, then setting x' = 0
at x = vt we get $v a_{11} = -a_{10}$. Then from (1.12) we get $a_{01} = -v a_{00}$. Finally we
get the solution to these equations as

$$a_{11} = \gamma, \quad a_{10} = -v\gamma = a_{01}, \quad a_{00} = \gamma, \tag{1.13}$$

where

$$\gamma = (1 - v^2)^{-1/2}. \tag{1.14}$$

Thus the transformation that preserves the form of the wave equation
is made up of the following relations between (t, x, y, z) and
(t', x', y', z'), the coordinates of O and O', respectively:

$$t' = \gamma(t - vx), \quad x' = \gamma(x - vt), \quad y' = y, \quad z' = z. \tag{1.15}$$

It is easy to invert these relations so as to express the unprimed
coordinates in terms of the primed ones. In that case we would find that
the relations look formally the same but with +v replacing -v:

$$t = \gamma(t' + vx'), \quad x = \gamma(x' + vt'), \quad y = y', \quad z = z'. \tag{1.16}$$

Physically it means that, if O' is moving with speed v relative to O,
then O is moving with speed -v relative to O'.

Somewhat more elaborate algebra shows that the Maxwell equations
are also invariant under the above transformation.
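The relations (1.15) and (1.16) lend themselves to a small numerical check. The following Python sketch (with c = 1, motion along x only, and hypothetical function names and sample values) applies the transformation, inverts it, and confirms that the combination t^2 - x^2 and the path of a light ray x = t are left unchanged.

```python
import math

def gamma(v):
    """Lorentz factor (1 - v^2)^(-1/2) in units with c = 1."""
    return 1.0 / math.sqrt(1.0 - v * v)

def boost(t, x, v):
    """Coordinates of O' in terms of those of O, following (1.15)."""
    g = gamma(v)
    return g * (t - v * x), g * (x - v * t)

def inverse_boost(t_p, x_p, v):
    """Coordinates of O in terms of those of O', following (1.16)."""
    g = gamma(v)
    return g * (t_p + v * x_p), g * (x_p + v * t_p)

def interval(t, x):
    """The quantity t^2 - x^2 whose invariance expresses the constancy of c."""
    return t * t - x * x

v = 0.6               # assumed relative speed
t, x = 2.0, 1.5       # an arbitrary sample event
t_p, x_p = boost(t, x, v)
t_back, x_back = inverse_boost(t_p, x_p, v)
light_t, light_x = boost(1.0, 1.0, v)  # an event on the light ray x = t

print("primed coordinates:", t_p, x_p)
print("round trip        :", t_back, x_back)
print("intervals         :", interval(t, x), interval(t_p, x_p))
print("light ray in O'   :", light_t, light_x)
```

The inverse is obtained simply by reversing the sign of v, and the light ray x = t is mapped to x' = t', which is the numerical counterpart of the statement that both observers measure the same speed of light.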

Einstein arrived at this result while considering the hypothetical
observer travelling with the light wavefront. He found that such an
observer could not exist. (This can be seen in our example below by
letting v go to c = 1.) In the process he arrived at the above transformation.
As we will shortly see, this transformation has echoes of the
work Lorentz had done in his attempts to explain the null result of the
Michelson-Morley experiment. We will refer to such transformations
by the name Lorentz transformations, the name given by Henri Poincaré
to honour Lorentz for his original ideas in this field.

We also see that the space coordinates and the time coordinate get 
mixed up in a Lorentz transformation. Thus, for a family of inertial 
observers moving with different relative velocities, we cannot compartmentalize
space and time as separate units. Rather they together form a
four-dimensional structure, which we will henceforth call ‘spacetime’. 


Example 1.3.1 Consider (1.15) with the following definition of θ:

$$v = c\tanh\theta.$$

Then, since $\gamma = \cosh\theta$ and $\gamma v = \sinh\theta$ (with c = 1), the identities for
hyperbolic functions lead us to the following transformation laws:

$$t' = t\cosh\theta - x\sinh\theta, \quad x' = x\cosh\theta - t\sinh\theta, \quad y' = y, \quad z' = z.$$

Compare the first two relations with the rotation of Cartesian axes x, y in
two (space) dimensions:

$$x' = x\cos\theta - y\sin\theta, \quad y' = y\cos\theta + x\sin\theta.$$

We may therefore consider the Lorentz transformation as a rotation through
an imaginary angle iθ, if we define an imaginary time coordinate as T = it.


1.4 The origin of special relativity 

Einstein thus found himself at a crossroads: the Newtonian mechanics 
was invariant under the Galilean transformation, whereas Maxwell’s 
equations were invariant under the Lorentz transformation. One could 
try to modify the Maxwell equations and look for invariance of the 
new equations under the Galilean transformation. Alternatively, one 
could modify the Newtonian mechanics and make it invariant under the 
Lorentz transformation. Einstein chose the latter course. We will now 
highlight his development of the special theory of relativity. 

We begin with the introduction of a special class of observers, the 
inertial observers in whose rest frame Newton’s first law of motion holds. 
That is, these observers are under no forces and so move relative to one 
another with uniform velocities. Notice that there is no explicitly defined 
frame that could be considered as providing a frame of ‘absolute rest’. 
Thus all inertial observers have equal status and so do their frames, 
which are the inertial frames. This is in contrast with the Newtonian
concept of absolute space, whose rest frame enjoyed a special status. 
We will comment on it further in Chapter 18 when we discuss Mach’s 
principle. 

The principle of relativity states that all basic laws of physics are the 
same for all inertial observers. Notice that this principle has not changed 
from its Newtonian form; but the inertial observers are now linked by 
Lorentz rather than Galilean transformations. 

When applied to electricity and magnetism this principle tells us 
that Maxwell’s equations are the same for all inertial observers: in particular,
the speed of light c, which appears as the wave velocity in these
equations, must be the same in all inertial reference frames. We also see
that this requirement leads us to the Lorentz transformation. The transformation
described by the equations (1.15) is called a ‘special Lorentz
transformation’. It can be easily generalized to the case in which the
observer O' moves with a constant velocity v in any arbitrary direction.
The relevant relations are

$$t' = \gamma\,[t - (\mathbf{v}\cdot\mathbf{r})], \quad \mathbf{r}' = \gamma(\mathbf{r}^* - \mathbf{v}t), \tag{1.17}$$

where

$$\mathbf{r}^* = \frac{\mathbf{r}}{\gamma} + \frac{(\gamma - 1)\,\mathbf{v}\,(\mathbf{v}\cdot\mathbf{r})}{\gamma v^2}. \tag{1.18}$$

We next look at some of the observable effects of this transformation 
on some measurements of events in space and time. For it is these effects 
that tell us what the special theory of relativity is all about. 


Example 1.4.1 Problem. Show that (1.17) reduces to (1.15) for a special
Lorentz transformation.

Solution. In the special Lorentz transformation, v is in the x-direction. So, if
e is a unit vector in that direction,

$$\mathbf{v}\cdot\mathbf{r} = vx, \quad \mathbf{r}^* = \mathbf{r}\sqrt{1 - v^2} + \left(\frac{1}{\sqrt{1 - v^2}} - 1\right) v^2 x\,\frac{1}{\gamma v^2}\,\mathbf{e},$$

where we have used (1.17) and (1.18). Thus $t' = \gamma(t - vx)$, which is as per
(1.15). For the r' relation, note that the y-y' relation is y' = y. Similarly we
have z' = z. The x-x' relation is

$$x' = x + \gamma(\gamma - 1)\,v^2 x\cdot\frac{1}{\gamma v^2} - \gamma v t = x(1 + \gamma - 1) - \gamma v t = \gamma(x - vt).$$

Thus we recover the special Lorentz transformation.
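The reduction worked out in this example can also be confirmed numerically. The sketch below implements the general relations t' = γ[t - v·r], r' = γ(r* - vt) with r* = r/γ + (γ - 1)v(v·r)/(γv^2), in units with c = 1. The function name, the tuple representation of vectors and the sample numbers are assumptions made for illustration.

```python
import math

def general_boost(t, r, v):
    """General Lorentz transformation for a velocity v = (vx, vy, vz), with c = 1."""
    v2 = sum(vi * vi for vi in v)
    if v2 == 0.0:
        return t, tuple(r)          # no relative motion: identity transformation
    g = 1.0 / math.sqrt(1.0 - v2)   # Lorentz factor
    v_dot_r = sum(vi * ri for vi, ri in zip(v, r))
    # r* = r/gamma + (gamma - 1) v (v.r) / (gamma v^2)
    r_star = tuple(ri / g + (g - 1.0) * vi * v_dot_r / (g * v2)
                   for ri, vi in zip(r, v))
    t_p = g * (t - v_dot_r)
    r_p = tuple(g * (rs - vi * t) for rs, vi in zip(r_star, v))
    return t_p, r_p

# Velocity along x: the result should agree with the special transformation.
t, r = 2.0, (1.0, 3.0, -4.0)
t_p, r_p = general_boost(t, r, (0.6, 0.0, 0.0))
print("special case :", t_p, r_p)

# Oblique velocity: t^2 - |r|^2 should still be unchanged.
t_q, r_q = general_boost(t, r, (0.3, 0.4, 0.5))
before = t * t - sum(ri * ri for ri in r)
after = t_q * t_q - sum(ri * ri for ri in r_q)
print("interval     :", before, after)
```

With v = 0.6 along x the factor γ is 1.25, so the first call should return t' = 1.75 and r' = (-0.25, 3, -4), in agreement with γ(t - vx), γ(x - vt), y and z.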


1.5 The law of addition of velocities


First of all, we notice that the speed of light remains c for all inertial 
observers. The Michelson-Morley experiment would therefore give zero 
difference in time gap, not a finite one as was then calculated on the basis 
of Newtonian kinematics. What does the Lorentz transformation do to 
the law of addition of velocities? 

Let us consider three inertial observers O, O' and O'' with frames of
reference aligned so that they share the same x-direction while their
origins coincided at t = t' = t'' = 0. We are given that O' is
moving in the x-direction with velocity $v_1$ relative to O. Likewise, O''
is moving with velocity $v_2$ relative to O'. So, what is the velocity of O''
relative to O? The Newtonian answer to this question would have been
$v_1 + v_2$. Here, however, the result is different.

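The Lorentz transformation relating O' to O is

$$x' = \frac{x - v_1 t}{\sqrt{1 - v_1^2}}, \quad t' = \frac{t - v_1 x}{\sqrt{1 - v_1^2}}. \tag{1.19}$$

Likewise, the Lorentz transformation linking O'' to O' is

$$x'' = \frac{x' - v_2 t'}{\sqrt{1 - v_2^2}}, \quad t'' = \frac{t' - v_2 x'}{\sqrt{1 - v_2^2}}. \tag{1.20}$$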


Our desired answer is found by combining equations (1.19) and
(1.20) so as to express the coordinates (t'', x'') in terms of (t, x). The
algebra is simple but a bit tedious and the answer is that O'' moves
relative to O as an inertial observer whose velocity v in the x-direction
relative to O is given by

$$v = \frac{v_1 + v_2}{1 + v_1 v_2}. \tag{1.21}$$

This is the law of addition of velocities. If the velocities $\mathbf{v}_1$, $\mathbf{v}_2$
are parallel but point in any general direction, the formula becomes

$$\mathbf{v} = \frac{\mathbf{v}_1 + \mathbf{v}_2}{1 + \mathbf{v}_1 \cdot \mathbf{v}_2}. \tag{1.22}$$


From (1.21) it is easy to see that, if one of the velocities is c = 1, the 
resultant is also c. Thus, irrespective of whether a source of light is at 
rest or moving relative to an observer, the light emitted by it will always 
have the speed c as measured by the observer. 
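The 'simple but tedious' algebra behind the addition law can be replaced by a numerical check. The sketch below (c = 1, hypothetical function names, arbitrary sample speeds) applies two successive transformations along x and compares the outcome with a single transformation at the speed given by the addition law (v1 + v2)/(1 + v1 v2).

```python
import math

def boost(t, x, v):
    """Special Lorentz transformation along x with speed v (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return g * (t - v * x), g * (x - v * t)

def add_velocities(v1, v2):
    """Relativistic law of addition of parallel velocities."""
    return (v1 + v2) / (1.0 + v1 * v2)

v1, v2 = 0.5, 0.7                 # assumed sample speeds
v = add_velocities(v1, v2)

# An arbitrary event, transformed O -> O' -> O'' and directly O -> O''.
t, x = 2.0, 1.5
two_step = boost(*boost(t, x, v1), v2)
one_step = boost(t, x, v)

# The origin of O'' follows x = v t in the frame of O, so its x'' must vanish.
_, origin_x2 = boost(*boost(1.0, v, v1), v2)

print("combined speed :", v)
print("two steps      :", two_step)
print("one step       :", one_step)
print("light plus v1  :", add_velocities(v1, 1.0))
```

The combined speed stays below 1 even though v1 + v2 exceeds 1, and adding any speed to that of light returns exactly 1.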


Example 1.5.1 Problem. Relate (1.21) to rotation of axes by imaginary
angles.

Solution. From Example 1.3.1, we have $v_1 = \tanh\theta_1$, $v_2 = \tanh\theta_2$, where
$i\theta_1$ and $i\theta_2$ are the rotation angles from frame O to O' and from O' to O'',
respectively. Thus the rotation angle from frame O to O'' is simply $i(\theta_1 + \theta_2)$.
The corresponding net velocity is

$$v = \tanh(\theta_1 + \theta_2) = \frac{\tanh\theta_1 + \tanh\theta_2}{1 + \tanh\theta_1 \tanh\theta_2} = \frac{v_1 + v_2}{1 + v_1 v_2}.$$
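The additivity of the angles θ in this example can be illustrated with a few lines of Python. In the sketch below (c = 1) the function names and sample speeds are assumptions; it relies only on the standard functions math.atanh, math.tanh, math.cosh and math.sinh.

```python
import math

def theta_of(v):
    """The angle theta defined by v = tanh(theta), for |v| < 1."""
    return math.atanh(v)

def add_via_angles(v1, v2):
    """Combine two parallel velocities by adding their angles theta."""
    return math.tanh(theta_of(v1) + theta_of(v2))

def add_velocities(v1, v2):
    """The law of addition of velocities, (v1 + v2)/(1 + v1 v2)."""
    return (v1 + v2) / (1.0 + v1 * v2)

v1, v2 = 0.5, 0.7                      # assumed sample speeds
print("via angles   :", add_via_angles(v1, v2))
print("addition law :", add_velocities(v1, v2))

# cosh(theta) and sinh(theta) reproduce gamma and gamma * v.
v = 0.6
theta = theta_of(v)
g = 1.0 / math.sqrt(1.0 - v * v)
print("cosh, gamma  :", math.cosh(theta), g)
print("sinh, gamma v:", math.sinh(theta), g * v)
```

Because the angles simply add, any number of successive transformations along the same direction can be combined by summing their values of θ and taking the hyperbolic tangent once at the end.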


1.5.1 The Minkowski spacetime 

It was Hermann Minkowski [3], one of Einstein’s teachers and a distinguished
mathematician, who brought elegance into the above picture by
pointing out that it was wrong to think of time and space as two separate 
entities; one should learn to look at the unity of the two. In his own 
words, 

The views of space and time which I wish to lay before you have sprung 
from the soil of experimental physics, and therein lies their strength. They 
are radical. Henceforth space by itself, time by itself, are doomed to fade 
away into mere shadows and only a kind of union of the two will preserve 
an independent reality. 

Thus one is really dealing with ‘spacetime’ instead of with ‘space’ 
and ‘time’. We may use the notation advocated by Minkowski by replacing
the time t and the Cartesian space coordinates (x, y, z) by their
four-dimensional counterparts:

$$x^0 = ct, \quad x^1 = x, \quad x^2 = y, \quad x^3 = z. \tag{1.23}$$

A Lorentz transformation may therefore be looked upon as a linear
spacetime coordinate transformation of the kind below:

$$x'^i = \sum_k A^i{}_k\,x^k = A^i{}_k\,x^k. \tag{1.24}$$

Here, in the last step, we have dropped the summation symbol in
summing over ‘k’. Our rule henceforth will be that any expression
containing an upper and lower index represented by the same Latin
letter is automatically summed for all four values of that letter. The
summation in this case is over all values of the index k, viz. 0, 1, 2, 3.

The condition that light has the same velocity in all inertial reference 
frames may be translated into the invariance of $(c^2t^2 - x^2 - y^2 - z^2)$
in the above transformation, and we may compare this situation with
the three-dimensional one where the squared length $(x^2 + y^2 + z^2)$ is
preserved under coordinate transformation. It is customary to write the
above four-dimensional square of the distance from the origin in our
new notation, as

$$\eta_{ik}\,x'^i x'^k = \eta_{ik}\,A^i{}_m x^m A^k{}_n x^n = \eta_{mn}\,x^m x^n,$$

which leads us to the general transformation laws of these frames. The
transformation coefficients must satisfy the rule given below:

$$\eta_{ik}\,A^i{}_m A^k{}_n = \eta_{mn}. \tag{1.25}$$

The 4 × 4 array $\eta_{ik}$ is the key factor specifying the measurement of
distance in the four-dimensional spacetime and is called the Minkowski
metric. It has the simple form diag(+1, -1, -1, -1). The transformations
which satisfy the above rule form a group called the Lorentz group
and denoted by O(1,3).

We define a vector $P^i$ as a four-component entity that transforms
under the coordinate transformation (1.24) as follows:

$$P'^i = A^i{}_m\,P^m. \tag{1.26}$$

Likewise, a second-rank tensor $F^{ik}$ transforms according to the law

$$F'^{ik} = A^i{}_m A^k{}_n\,F^{mn}. \tag{1.27}$$

We may use the metric $\eta_{ik}$ to lower an upper index to a lower one, e.g.,

$$A^i\,\eta_{ik} = A_k.$$

Likewise we take the inverse matrix of $\|\eta_{ik}\|$ to be $\|\eta^{ik}\|$ so that

$$\eta_{ik}\,\eta^{kl} = \delta_i{}^l.$$

Here the delta is the Kronecker delta, which is zero unless the upper
and lower index happen to be equal, when its value is unity. It is easy to
see that the inverse of the above index-lowering exercise is

$$A_i\,\eta^{ik} = A^k.$$

In addition to vectors and tensors, we have scalars, which retain the same
value under all coordinate transformations. Thus $\eta_{ik}P^iP^k$ is a scalar, as
can be seen by subjecting it to a Lorentz transformation. A scalar does
not have any free index.

If a vector $A^k$ is given we can make a scalar out of it by writing

$$A^2 = \eta_{ik}\,A^i A^k.$$

A is called the magnitude of the vector. If $A^2 > 0$ the vector $A^k$ is said
to be ‘timelike’. For $A^2 = 0$ we have in $A^k$ a ‘null’ vector, while for
$A^2 < 0$ we have a ‘spacelike’ vector.
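The index notation of this section translates directly into small array manipulations. The following Python sketch is an illustration under stated assumptions: it uses plain nested lists, c = 1, the metric diag(+1, -1, -1, -1), and a transformation matrix for motion along x built from (1.15). All function names are hypothetical. It checks the condition that the transformation coefficients preserve the metric and classifies a few sample vectors.

```python
import math

# Minkowski metric eta_ik = diag(+1, -1, -1, -1)
ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]

def boost_matrix(v):
    """Coefficients A^i_k of the special Lorentz transformation along x."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -g * v, 0.0, 0.0],
            [-g * v, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def transform(A, p):
    """P'^i = A^i_m P^m, with the sum over m written out."""
    return [sum(A[i][m] * p[m] for m in range(4)) for i in range(4)]

def norm_squared(p):
    """The scalar eta_ik P^i P^k."""
    return sum(ETA[i][k] * p[i] * p[k] for i in range(4) for k in range(4))

def preserves_metric(A, tol=1e-12):
    """Check eta_ik A^i_m A^k_n = eta_mn for every pair (m, n)."""
    for m in range(4):
        for n in range(4):
            total = sum(ETA[i][k] * A[i][m] * A[k][n]
                        for i in range(4) for k in range(4))
            if abs(total - ETA[m][n]) > tol:
                return False
    return True

def classify(p, tol=1e-12):
    """Label a vector as timelike, null or spacelike from the sign of its norm."""
    s = norm_squared(p)
    if s > tol:
        return "timelike"
    if s < -tol:
        return "spacelike"
    return "null"

A = boost_matrix(0.6)
p = [2.0, 1.0, 0.5, 0.0]
print(preserves_metric(A))
print(norm_squared(p), norm_squared(transform(A, p)))
print(classify([2.0, 1.0, 0.0, 0.0]), classify([1.0, 1.0, 0.0, 0.0]),
      classify([1.0, 2.0, 0.0, 0.0]))
```

The tolerance in classify is a practical choice for floating-point arithmetic and is not part of the definition. Since the norm is a scalar, the label assigned to a vector is the same in every inertial frame.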

We will use this notation in an extended form (see Chapter 3) when
dealing with general relativity. In general we may say that any expression
written in terms of vectors and tensors (and, of course, scalars) is
guaranteed to be Lorentz invariant. Since the special theory of relativity
requires all physics to be the same for all inertial observers, we need all
basic physical equations to be Lorentz invariant also. Thus we expect all
fundamental physics to be expressed in terms of vectors and tensors.²


Example 1.5.2 A tensor is symmetric if any permutation of its indices
does not alter its value. Likewise an antisymmetric tensor reverses its sign
for any odd permutation of its indices.

An example of the latter is $\epsilon_{ijkl}$, defined by

$$\epsilon_{ijkl} = \begin{cases} +1 & \text{if } (i, j, k, l) \text{ is an even permutation of } (1, 2, 3, 4), \\ -1 & \text{if } (i, j, k, l) \text{ is an odd permutation of } (1, 2, 3, 4), \\ 0 & \text{otherwise.} \end{cases}$$

We will encounter this tensor in Chapter 3 in the context of general coordinate
transformations.


1.6 Lorentz contraction and time dilatation 

The Lorentz transformation generates several paradoxical situations 
largely because one is normally and intuitively tuned to absolute space 
measurements and absolute time measurements. We will describe some 
examples next. For our discussion we will take the Lorentz transformation
to be as given in Equations (1.15) and (1.16).

1.6.1 Length contraction 

Let us consider the following experiment. Let O' carry a rod of length l
as measured by him when the rod is at rest in his reference frame, that 
is, the frame in which he is at rest. Suppose the rod is laid out along the 
x'-axis with front end B at x' = l and back end A at x' = 0 as shown in 
Figure 1.3. Suppose that O sets up an experiment of timing when the two 
ends A and B pass his origin x = 0. Evidently the expectation based on 
Newtonian physics is that the two ends will pass the origin at an interval 
of l/v. Let us see what the Lorentz transformation (1.15) gives. 

When the end B passes the origin of O, as shown in Figure 1.3(a),
we have x' = l, x = 0 and the second of the equations listed in (1.15)
gives $l = \gamma(x - vt)$, i.e., for x = 0 we get $t = -l/(\gamma v)$. Likewise when
the end A passes the origin as shown in Figure 1.3(b), the time in the
frame of O is t = 0. The time interval between the passages of A and
B is therefore $l/(\gamma v)$. Thus, knowing that O' is moving with speed v
relative to him, O will conclude that the length of the rod is $l/\gamma$, so
the rod appears to him contracted by the factor $\sqrt{1 - v^2}$.

[Fig. 1.3. Two stages (a) and (b) of the movement of the rod AB as observed by O, as per the arrangement described in the text. In (a) the end B (x' = l) is at x = 0 at time t = -l/(γv); in (b) the end A (x' = 0) is at x = 0 at time t = 0.]

² The inputs provided by quantum mechanics have led to the addition of spinors to this
list. We will not have any opportunity to discuss the role of spinors in this text.

While we may be tempted to identify this with the Fitzgerald-Lorentz
idea that an object moving with speed v relative to the aether will
contract by just this factor in its direction of motion, such an
identification is wrong. The aether contraction was an absolute effect and
explained as such on the basis of certain assumptions about the atomic
structure of matter. Here the effect is relative. Moreover, to observer O'
a similar rod carried by O would appear contracted by the same factor.
In short, the effect is symmetric between the two inertial observers.

Fig. 1.4. Two stages in the passage of the moving clock of O'. As
described in the text, at stage (a) O' is passing the origin of the
stationary observer O, whereas in stage (b) this clock is observed by
another stationary clock at D.

1.6.2 Time dilatation 

Consider next another experiment in which a clock is taken by O' at its 
origin, and, as it passes the origin of O as well as another specified spot D 
at x = h, say, the times in the clock are recorded. Also, O has stationed
clocks at the origin as well as at D. These clocks record their times as the 
moving clock from O' passes them by. Again our Newtonian expectation 
is that the time interval recorded by the two clocks of O should equal 
the time interval between the two readings by the clock of O'. Let us see 
what the reality is. Figure 1.4 illustrates the experiment. 

The clock of O' has coordinate x' = 0. When it passes the origin
of O, with coordinate x = 0, we have from the Lorentz transformation
that t = t' = 0. This is shown in Figure 1.4(a). Similarly, as shown in
Figure 1.4(b), when the clock of O' is at D, we have x' = 0, x = h so
that the times when these clocks meet are $t = h/v$ and
$t' = \gamma^{-1}h/v$. Thus we see that the interval recorded by the moving
clock of O' is short by a factor $\sqrt{1 - v^2}$ compared with the times
recorded by the two clocks of O. So on the basis of this experiment O will
conclude that the clock kept by O' is running slow compared with his clock.
This effect is often referred to as time dilatation.

However, this experiment could be performed with O' using two 
clocks and O using one, with the roles of the two observers interchanged. 
Then O' would find that the clock system used by O is running slow. 
The experiment as such is not designed such that a symmetric role is 
played by each inertial frame. Thus there is no paradox involved here. 
Neither is it the case that one inertial frame is accorded a special status. 
A look back at the length-contraction experiment described earlier will 
likewise show that there too no special status is enjoyed by any inertial 
frame. 
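
The two thought experiments above can be checked numerically. The sketch
below assumes the standard boost along x with c = 1, that is
$t' = \gamma(t - vx)$ and $x' = \gamma(x - vt)$, which is the form used in
this section; the helper names and the sample values of v, l and h are
illustrative assumptions.

```python
import math

def gamma(v):
    """Lorentz factor for a speed v given in units of c."""
    return 1.0 / math.sqrt(1.0 - v * v)

def to_primed(t, x, v):
    """Coordinates (t', x') in the frame of O' of an event (t, x) recorded by O."""
    g = gamma(v)
    return g * (t - v * x), g * (x - v * t)

v, rod_length, h = 0.6, 1.0, 3.0   # illustrative values only
g = gamma(v)

# Length contraction: end B passes x = 0 at t = -l/(gamma v), end A at t = 0.
t_B = -rod_length / (g * v)
_, x_prime_B = to_primed(t_B, 0.0, v)   # should equal the rest length l
length_seen_by_O = v * (0.0 - t_B)      # speed times elapsed time = l/gamma

# Time dilatation: the clock of O' (x' = 0) reaches D (x = h) at t = h/v.
t_D = h / v
t_prime_D, x_prime_D = to_primed(t_D, h, v)

print(length_seen_by_O, rod_length / g)   # both are l/gamma
print(t_prime_D, t_D / g)                 # both are h/(gamma v)
```

Both comparisons use only events at which two clocks coincide, which is why
the same transformation settles each case without ambiguity.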


1.6.3 Muon decay in cosmic-ray showers 

A striking demonstration of time dilatation was given by observations
of the muon particles ($\mu$) in cosmic-ray showers. The muon (earlier
called the $\mu$ meson) is very similar to the electron, except that it is
207 times heavier. The particle is normally unstable and in the laboratory
rest frame its decay time is as low as $(2.09 \pm 0.03) \times 10^{-6}$ s.
It decays into the electron, a muon neutrino and an electron anti-neutrino.

In cosmic-ray showers the presence of muons would normally be 
difficult to understand. For they are believed to have been produced at a 
height of 10 km and, even if they travelled with the speed of light, they 
could not travel more than a distance of some 600 m before decaying. 
So how did they manage to survive long enough to be seen so close to 
Earth’s surface? 

The puzzle is explained by arguing that, if the muons are travelling 
with very high speed, their natural clocks as seen by observers on the 
Earth would run slow. At a speed of 0.995c the time-dilatation factor 
$\gamma$ is approximately equal to 10. This makes the apparent lifetime of the muon relative
to an Earth-based laboratory ten times longer and thus it can travel up 
to 6 km, instead of 600 m. Therefore in a normal statistical fluctuation 
we should be able to see some muons coming from even 10 km above 
sea-level. 
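
The figures quoted above can be reproduced with a few lines of Python. This
is only a sketch: the decay time is the value quoted in the text, the speed
of light is the SI value, and the function names are chosen here for
illustration.

```python
import math

C = 299_792_458.0    # speed of light in m/s (SI value)
TAU_REST = 2.09e-6   # decay time in the muon rest frame, as quoted in the text (s)

def gamma(beta):
    """Time-dilatation factor for speed beta = v/c."""
    return 1.0 / math.sqrt(1.0 - beta * beta)

def decay_length(beta, tau=TAU_REST):
    """Distance in metres covered in one dilated decay time at speed beta*c."""
    return gamma(beta) * beta * C * tau

range_without_dilatation = C * TAU_REST   # upper limit if clocks did not slow down
g = gamma(0.995)
range_with_dilatation = decay_length(0.995)

print(f"no dilatation: {range_without_dilatation:.0f} m")
print(f"gamma at 0.995c: {g:.2f}")
print(f"with dilatation: {range_with_dilatation / 1000:.1f} km")
```

The dilated range comes out at roughly ten times the undilated one, in line
with the 600 m and 6 km figures in the text.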


1.6.4 Relativity of simultaneity 

In Newtonian physics the absolute character of space and time enabled 
one to give an absolute meaning to the simultaneity of two events at 
different places. Thus, if event A took place at $x = x_1$ and event B at
$x = x_2$, both at time t, then they would be called simultaneous.





In relativistic physics, the above statement needs to be modified. We 
can describe the above simultaneity as that seen by an inertial observer 
O. To another inertial observer O' as defined earlier, the events would 
not be simultaneous. For event A will occur at 

$$t' = \gamma(t - x_1v),$$

whereas event B will occur at

$$t' = \gamma(t - x_2v).$$

So the events are not simultaneous in the reference frame of O'. As
seen by O', A will occur before or after B depending on whether $x_1$
exceeds or is less than $x_2$.
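
A short numerical check of this ordering is sketched below, with c = 1 and
arbitrary sample values; the function name `t_prime` is an illustrative
choice.

```python
import math

def t_prime(t, x, v):
    """Time assigned by O' to an event that O records at (t, x); c = 1."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return g * (t - v * x)

v, t = 0.5, 2.0        # relative speed and the common time in the frame of O
x1, x2 = 3.0, 1.0      # positions of events A and B, with x1 > x2

tA = t_prime(t, x1, v)
tB = t_prime(t, x2, v)

# With v > 0 and x1 > x2, event A precedes event B according to O'.
print(tA < tB)
```

Reversing the sign of v, or exchanging the two positions, reverses the
order, which is the sense in which simultaneity is relative.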

These examples illustrate the tricks time can play with our intuitive 
perception of physical events. There is another paradox, which we will 
discuss towards the end of this chapter: the so-called clock paradox. 
Before considering that, we need to discuss how Newtonian mechanics 
has to be modified in order to accommodate Lorentz invariance, that is, 
invariance with respect to Lorentz transformation. 


1.7 Relativistic mechanics 


Let us define the spacetime trajectory of a particle of mass m at rest,
denoting its coordinates by $x^i = (t, x, y, z)$. Such a trajectory is
often called the world line of the particle. The three space coordinates
are usually denoted by a space vector $\mathbf{r}$. In Newtonian physics
the three-dimensional velocity of the particle of mass m is denoted by

$$\frac{d\mathbf{r}}{dt} = \mathbf{v}.$$

Here, in relativity, we define a 4-velocity of the particle of mass m
by

$$u^i = \frac{dx^i}{ds}. \tag{1.28}$$

Here ds is the ‘proper distance’ between the two points (t, x, y, z)
and (t + dt, x + dx, y + dy, z + dz) on the spacetime trajectory of
the particle. Using the fact that

$$ds^2 = dt^2 - (dx^2 + dy^2 + dz^2) = dt^2(1 - v^2) = dt^2/\gamma^2 \tag{1.29}$$

we get the following relationship between the Newtonian 3-velocity and
the relativistic 4-velocity:

$$u^i = \gamma \times [1, \mathbf{v}]. \tag{1.30}$$

Notice that the identity (1.29) shows that the four components of $u^i$ are
related and so we have only three independent components. The identity
may be written as

$$u^iu_i = 1, \quad \text{i.e.,} \quad u_i \times \frac{du^i}{ds} = 0. \tag{1.31}$$

With this basic definition, we now consider Newton’s laws of motion.
The first law remains as it is. The second law needs some consideration
to make it Lorentz-invariant, i.e., invariant under Lorentz
transformation. How should force and acceleration be related? In the rest
frame of the moving body, suppose its mass is $m_0$. We assume that the
difference introduced by the Lorentz invariance would show up only at large
velocities and so the second law should look the same as the Newtonian one
in the rest frame of the body. This may be written as

$$m_0\frac{d^2\mathbf{r}}{dt^2} = \mathbf{F}, \tag{1.32}$$


where $\mathbf{F}$ is the measure of force in the rest frame of the body.
Now, we ask, what is the acceleration formula in a general Lorentz frame,
which reduces to the above in the rest frame? To this end we first convert
(1.32) into a set of four equations instead of three, by adding the fourth
component, and then replace dt by ds. Thus we have the second law as

$$m_0\frac{d^2x^i}{ds^2} = G^i, \tag{1.33}$$

where the 4-force $G^i$ is related to the Newtonian force in the following
way. Take the spacelike components of the above equation. We get

$$\frac{dt'}{ds}\frac{d}{dt'}\left(m_0\frac{dt'}{ds}\frac{d\mathbf{r}'}{dt'}\right) = \mathbf{G},$$

say. Here we have assumed that the spacetime coordinates in the general
Lorentz frame are $(t', \mathbf{r}')$. This looks like Newton’s second law
with minimal modification, if we write it as

$$\frac{d}{dt'}(m\mathbf{v}') = \mathbf{F}, \tag{1.34}$$

where $\mathbf{v}' = d\mathbf{r}'/dt'$ and

$$m = \gamma m_0, \qquad \mathbf{F} = \mathbf{G}/\gamma. \tag{1.35}$$


This then is the modified version of Newton’s second law of motion.
We can write it in the four-dimensional language of Minkowski as follows:

$$m_0\frac{du^i}{ds} = G^i, \tag{1.36}$$

where $G^1$, $G^2$, $G^3$ are the spacelike components which we write as
the 3-vector $\mathbf{G}$ and $P = G^0$ is the timelike fourth component.
From the above equation we see that this ‘extra’ component of the
relativistic equations of motion implies

$$\frac{d}{ds}(\gamma m_0) = P. \tag{1.37}$$


We will next study its implication for the mass-energy relation. 


1.7.1 The mass-energy relation 

Let us re-examine the four-dimensional status given to the force. We
have three components in the form of the vector $\mathbf{G}$ defined above
as $\mathbf{F}\gamma$. We then have a fourth (time) component, P. Writing
the 4-vector for force as

$$F^i = [P, \mathbf{G}],$$

we use the invariance of the expression

$$\eta_{ik}u^iF^k = P\frac{dt}{ds} - \mathbf{G}\cdot\frac{d\mathbf{r}}{ds}. \tag{1.38}$$

We note two things. First, from (1.31), we get the left-hand side of
(1.38) as zero. In the rest frame we therefore have P = 0. Secondly,
because (1.38) vanishes in one inertial frame, it must do so in all inertial
frames. Hence we get

$$P = \mathbf{G}\cdot\mathbf{v}. \tag{1.39}$$


This means that $P/\gamma$ is $\mathbf{F}\cdot\mathbf{v}$, which is the
rate of working of the force $\mathbf{F}$. If we denote by W the energy
generated in the particle by these external forces, then

$$\frac{dW}{ds} = \frac{dW}{dt} \times \frac{dt}{ds} = \frac{P}{\gamma} \times \gamma = P. \tag{1.40}$$

However, because of the equation (1.37) satisfied by P, we can write
the above as

$$\frac{dW}{ds} = \frac{d}{ds}(\gamma m_0), \tag{1.41}$$

which, in view of Equation (1.35), integrates to

$$W = m + \text{constant}.$$


The energy generated by external forces being one of motion, we 
interpret W as kinetic energy and so require it to be zero for the particle 
at rest. Hence we adjust the constant such that 

$$W = (m - m_0)c^2, \tag{1.42}$$

where we have restored c (which had been put equal to unity) so as to
remind ourselves of the difference in dimensionality of mass and energy.
Arguing that a particle of mass $m_0$ at rest also possesses energy
$m_0c^2$, Einstein came up with the total energy of a moving particle as

$$E = mc^2. \tag{1.43}$$

This is perhaps the best-known equation of physics, especially if we take 
lay non-physicists also into consideration! A useful application of this 
result is found in the Sun. The core of the Sun has hot plasma in which 
four protons (i.e., nuclei of hydrogen atoms) combine together to make 
a helium nucleus, through the reaction 

$$4\,{}^1\mathrm{H} \to {}^4_2\mathrm{He} + 2\nu + 2e^+ + \gamma.$$

That is, the byproducts are neutrinos, positrons and radiation. The
mass of the four hydrogen nuclei exceeds the mass of the helium produced.
The difference in mass $\Delta m$ is not lost but is produced in the form
of energy $\Delta m\,c^2$. Even after accounting for neutrinos and
positrons, the bulk of this energy appears as solar radiation. Thus the Sun
shines because of $E = mc^2$!

1.7.2 The linear momentum 

In terms of Minkowski’s four-dimensional framework, we can write
some of the above relations as follows. We start by defining the
energy-momentum 4-vector for a particle of rest mass $m_0$ as

$$P^i = \left(m_0\frac{dt}{ds},\; m_0\frac{d\mathbf{r}}{ds}\right) = [E, \mathbf{p}], \tag{1.44}$$

where $\mathbf{p} = m\mathbf{v}$ is the Newtonian 3-momentum of the
particle. Since the Minkowski 4-velocity has unit magnitude, we have

$$E^2 - p^2 = m_0^2, \tag{1.45}$$

where we have put back the condition c = 1.

In general, interactions between subatomic particles lead to decay,
scattering, etc. of particles and there the conservation of 4-momentum
generally holds. We will discuss a couple of examples from electrodynamics
to illustrate how the law operates.

But first we specify what happens to the photons, the particles of
light, vis-a-vis Equation (1.45). The photons always travel with the speed
of light and so one cannot really talk of their ‘rest’ mass. Nevertheless,
a common misnomer is that photons, or all those particles which travel
with the speed of light, have a ‘zero rest mass’. So if a photon $\gamma$
of frequency $\nu$ is travelling in a direction specified by a unit
(spacelike) vector $\mathbf{e}$ its energy-momentum vector is defined as

$$P_\gamma = [h\nu, h\nu\mathbf{e}]. \tag{1.46}$$

It is easy to see that the magnitude of this momentum is zero, that
is, $P_\gamma \cdot P_\gamma = 0$. The photon therefore has a 4-momentum
of ‘zero’ magnitude and it is described by a null vector.

1.7.3 The centre-of-mass frame 

Consider the situation in electrodynamics in which a particle- 
antiparticle pair, such as an electron and a positron, is created from 
radiation. Let us see whether we can create the pair from a single photon. 

It is convenient to look at the problem from a special inertial frame: 
one in which the total 3-momentum of all participating particles is zero. 
Such a frame is called the centre-of-mass frame, because in this frame 
the centre-of-mass is at rest. How to go from an arbitrary inertial frame 
to a centre-of-mass frame is shown in the two solved problems at the 
end of this subsection. 

So we take the electron and positron each to have the same rest mass
$m_0$ but equal and opposite 3-velocities $\pm\mathbf{v}$. Thus the
4-momenta for the two particles are

$$\text{electron } e^-:\; P^- = \gamma m_0 \times [1, \mathbf{v}]; \qquad \text{positron } e^+:\; P^+ = \gamma m_0 \times [1, -\mathbf{v}].$$

The factor $\gamma$ has its usual meaning. But, when we add the two
momenta, we get the total momentum as

$$P^- + P^+ = 2\gamma m_0 \times [1, 0, 0, 0].$$

If this pair had to have come from just one light photon, the photon 
must have had the above momentum. But this is impossible, since the 
photon momentum is a null vector, whereas the above vector is timelike. 
We can satisfy the above requirement of conservation of 4-momenta if 
we have two photons to work with. 
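
The argument can be made concrete with a small numerical sketch in units
where c = 1. The helper names and the sample values (unit rest mass, speed
0.8, photons emitted along the y-axis) are assumptions made purely for
illustration.

```python
import math

def norm_sq(p):
    """Minkowski norm squared E^2 - |p|^2 of a 4-momentum (c = 1)."""
    energy, px, py, pz = p
    return energy**2 - (px**2 + py**2 + pz**2)

def massive_momentum(m0, vx):
    """4-momentum of a particle of rest mass m0 moving along x with speed vx."""
    g = 1.0 / math.sqrt(1.0 - vx * vx)
    return (g * m0, g * m0 * vx, 0.0, 0.0)

def photon_momentum(energy, direction):
    """4-momentum of a photon; `direction` is assumed to be a unit 3-vector."""
    ex, ey, ez = direction
    return (energy, energy * ex, energy * ey, energy * ez)

m0, v = 1.0, 0.8
electron = massive_momentum(m0, v)
positron = massive_momentum(m0, -v)
total = tuple(a + b for a, b in zip(electron, positron))

# A single photon carrying the same energy is null, so it cannot match `total`.
single_photon = photon_momentum(total[0], (1.0, 0.0, 0.0))

# Two photons of equal energy and opposite directions can match it.
photon_a = photon_momentum(total[0] / 2, (0.0, 1.0, 0.0))
photon_b = photon_momentum(total[0] / 2, (0.0, -1.0, 0.0))
two_photons = tuple(a + b for a, b in zip(photon_a, photon_b))

print(norm_sq(total), norm_sq(single_photon), norm_sq(two_photons))
```

The pair and the two-photon state share a positive norm, whereas the
single photon has zero norm whatever its energy or direction.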


Example 1.7.1 Problem. Consider the reaction $\pi^+ + n \to \Lambda^0 + K^+$.
The rest masses of the four particles are, respectively, $m_\pi$, $m_n$,
$m_\Lambda$ and $m_K$. The neutron was at rest in the laboratory frame. Show
that the total energy in the centre-of-mass frame is
$(m_\pi^2 + m_n^2 + 2m_nE_\pi)^{1/2}$, where $E_\pi$ is the pion energy.

Solution. The total 4-momentum is $P^i$, where

$$P^i = P^i_\pi + P^i_n = P^i_\Lambda + P^i_K.$$

In the centre-of-mass frame, the total energy is $P^0_{\rm cm}$ and the
total 3-momentum is 0. Since $P_iP^i$ is invariant in all frames, we may
equate its values in the lab frame and in the centre-of-mass frame. In the
lab frame,

$$P_iP^i = (P^\pi_i + P^n_i)(P^i_\pi + P^i_n) = m_\pi^2 + m_n^2 + 2P^i_\pi P^n_i \quad \text{from (1.45)}.$$

Now $P^i_\pi = (E_\pi, \mathbf{p}_\pi)$, $P^i_n = (E_n, 0)$ and so
$P^i_\pi P^n_i = E_\pi E_n$. But $E_n = m_n$ since the neutron is at rest.
So in the lab frame $P_iP^i = m_\pi^2 + m_n^2 + 2m_nE_\pi$.
In the centre-of-mass frame the total 4-momentum is $(P^0_{\rm cm}, 0)$.

Hence $P_iP^i = (P^0_{\rm cm})^2 = m_\pi^2 + m_n^2 + 2m_nE_\pi$.


Example 1.7.2 Problem. Two particles of rest masses $m_1$ and $m_2$ collide.
Prior to collision the first particle was at rest while the second was
approaching it with speed $v_2$. What is the energy of the system in the
centre-of-mass frame?

Solution. In the laboratory frame, the 3-momentum has only one component
$\gamma_2m_2v_2$ in the direction of motion. The energy of the system in
this frame is $m_1 + \gamma_2m_2$, where $\gamma_2^{-1} = \sqrt{1 - v_2^2}$.

Let the total energy in the centre-of-mass frame be $E_0$. The total
3-momentum is zero by definition. Equating $P_iP^i$ in the two frames, we
get

$$(m_1 + \gamma_2m_2)^2 - \gamma_2^2m_2^2v_2^2 = E_0^2,$$

i.e., $E_0^2 = m_1^2 + m_2^2 + 2\gamma_2m_1m_2$.
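
The closed form obtained in this example can be compared with a direct
evaluation of the invariant $P_iP^i$ in the laboratory frame. The sketch
below uses c = 1; the function names and sample masses are illustrative
assumptions.

```python
import math

def cm_energy_closed_form(m1, m2, v2):
    """Centre-of-mass energy from E0^2 = m1^2 + m2^2 + 2*gamma2*m1*m2 (c = 1)."""
    g2 = 1.0 / math.sqrt(1.0 - v2 * v2)
    return math.sqrt(m1**2 + m2**2 + 2.0 * g2 * m1 * m2)

def cm_energy_from_invariant(m1, m2, v2):
    """Centre-of-mass energy from the lab-frame value of (total E)^2 - (total p)^2."""
    g2 = 1.0 / math.sqrt(1.0 - v2 * v2)
    energy = m1 + g2 * m2       # target at rest plus moving projectile
    momentum = g2 * m2 * v2     # only the projectile carries 3-momentum
    return math.sqrt(energy**2 - momentum**2)

m1, m2, v2 = 2.0, 1.0, 0.6      # sample values
print(cm_energy_closed_form(m1, m2, v2), cm_energy_from_invariant(m1, m2, v2))
```

Writing $E_2 = \gamma_2m_2$ for the energy of the moving particle turns the
same expression into the one found in Example 1.7.1.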


1.7.4 Compton scattering 

We look at another example from electrodynamics shown in Figure 1.5.
We have a photon of frequency $\nu_1$ incident on an electron at rest. The
electron acquires some momentum from this impact and moves with
speed u in a direction making an angle $\theta$ with the original direction
of the photon. The photon after the impact is scattered in the direction
making an angle $-\phi$ with its original direction of motion. We want to
relate the frequency $\nu_2$ of the scattered photon to the angle of its
scattering.

It can be easily verified that the entire scenario is confined to a plane,
that determined by the paths of the original and scattered photons. Let
$m_0$ and m denote the rest mass and moving mass of the electron before and
after the impact. The 4-momenta of the various particles are as follows.

(1) Before scattering:

electron momentum $P_e = m_0 \times [1, 0, 0, 0]$,
photon momentum $P_\gamma = h\nu_1 \times [1, 1, 0, 0]$.

(2) After scattering:

electron momentum $P_e' = m \times [1, u\cos\theta, u\sin\theta, 0]$,
photon momentum $P_\gamma' = h\nu_2 \times [1, \cos\phi, -\sin\phi, 0]$.

Fig. 1.5. A schematic diagram of Compton scattering. The photon $\gamma$
coming from the left impacts the stationary electron $e^-$ and scatters it
in the direction making angle $\theta$ with the original direction of
$\gamma$. The photon itself is scattered in the direction making angle
$\phi$ with its original direction.

The law of conservation of 4-momenta then reduces to the following
three equations:

$$\begin{aligned}
m_0 + h\nu_1 &= m + h\nu_2, \\
h\nu_1 &= h\nu_2\cos\phi + mu\cos\theta, \\
0 &= -h\nu_2\sin\phi + mu\sin\theta.
\end{aligned} \tag{1.47}$$

Upon eliminating $\theta$ from these relations, we get

$$m^2u^2 = h^2[(\nu_1 - \nu_2\cos\phi)^2 + \nu_2^2\sin^2\phi].$$

Using the relation $m = m_0/\sqrt{1 - u^2}$ and the first of the three
equations in (1.47), we have finally the relation

$$\frac{1}{\nu_2} - \frac{1}{\nu_1} = \frac{h}{m_0}(1 - \cos\phi).$$

That is, in terms of wavelengths,

$$\lambda_2 - \lambda_1 = \frac{h}{m_0}(1 - \cos\phi).$$

This is known as Compton scattering because the change of wavelength
of the photon on scattering was first measured by A. H. Compton
in 1923. The signature of the effect is that the change in wavelength does
not depend on the initial wavelength of the photon. It is dependent only
on the angle of scattering. The quantity $h/(m_0c)$ is usually referred to
as the Compton wavelength of the electron. Its value is 0.0024 nm.

Particle-particle collisions, or decays, are phenomena that have been
used extensively to test the validity of the assumptions that lead to special
relativity. A few more cases are described in solved problems that follow
as well as in the exercises at the end of the chapter.
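
The size of the Compton shift is easy to evaluate numerically. The sketch
below restores c and uses SI values of the constants (the electron mass is
the CODATA 2018 figure); the function name `compton_shift` is an
illustrative choice.

```python
import math

H = 6.62607015e-34       # Planck constant in J s (exact SI value)
C = 299_792_458.0        # speed of light in m/s (exact SI value)
M_E = 9.1093837015e-31   # electron rest mass in kg (CODATA 2018)

def compton_shift(phi):
    """Increase in photon wavelength (metres) for scattering angle phi (radians)."""
    return H / (M_E * C) * (1.0 - math.cos(phi))

compton_wavelength_nm = H / (M_E * C) * 1e9

print(f"Compton wavelength: {compton_wavelength_nm:.4f} nm")
print(f"shift at 60 degrees: {compton_shift(math.pi / 3) * 1e9:.4f} nm")
print(f"shift at 180 degrees: {compton_shift(math.pi) * 1e9:.4f} nm")
```

Because the shift depends only on the angle, it is most noticeable for
photons whose wavelength is itself comparable to the Compton wavelength,
such as X-rays and gamma rays.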


Example 1.7.3 Problem. A particle with rest mass $m_0$ has 3-momentum
$\mathbf{p}$. An observer with 3-velocity $\mathbf{u}$ looks at the
particle: what energy would he measure?

Solution. In the rest frame of the observer, $u'^0 = 1$, $u'^\mu = 0$
($\mu = 1, 2, 3$). Suppose he measures the particle's energy to be $E'$.
Suppose also that in his rest frame the 4-momentum of the particle is
$P'_i = (E', -\mathbf{p}')$, say. Thus $P'_iu'^i = E'$.

Now evaluate the same invariant in the given laboratory frame. Then
the energy of the particle is $P^0 = \sqrt{p^2 + m_0^2}$. The observer has
3-velocity $\mathbf{u}$. So his 4-velocity is $\gamma(1, \mathbf{u})$,
where $\gamma^{-1} = \sqrt{1 - u^2}$. We therefore have
$P_iu^i = \sqrt{p^2 + m_0^2} \times \gamma - \gamma\,\mathbf{p}\cdot\mathbf{u}$.
Therefore the energy measured by the observer is
$E' = \gamma(\sqrt{p^2 + m_0^2} - \mathbf{p}\cdot\mathbf{u})$.


Example 1.7.4 Problem. A particle of rest mass $m_0$ moving with velocity
v collides with a stationary particle of rest mass M and is absorbed by it.
Given that energy and momentum are conserved in the collision, find the
rest mass and velocity of the composite particle.

Solution. The 4-momentum of the moving particle is
$[m_0\gamma, m_0\gamma v, 0, 0]$, where the direction of motion is chosen as
the x-axis; $\gamma^{-1} = \sqrt{1 - v^2}$.

The 4-momentum of the stationary particle is [M, 0, 0, 0].

Thus the 4-momentum of the composite particle is
$[M + m_0\gamma, m_0\gamma v, 0, 0]$. The mass of this particle is $M_0$,
say. Then

$$M_0^2 = (M + m_0\gamma)^2 - m_0^2\gamma^2v^2 = M^2 + 2m_0M\gamma + m_0^2\gamma^2(1 - v^2) = M^2 + 2m_0M\gamma + m_0^2.$$

The velocity is given by $m_0\gamma v/(M + m_0\gamma)$.
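
A numerical version of this example is sketched below in units where c = 1.
The function name `absorb` and the sample values are assumptions made for
illustration.

```python
import math

def absorb(m0, v, M):
    """Rest mass and velocity of the composite formed when a particle of rest
    mass m0 and speed v is absorbed by a particle of rest mass M at rest."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    energy = M + m0 * g       # total energy is conserved
    momentum = m0 * g * v     # total 3-momentum is conserved
    rest_mass = math.sqrt(energy**2 - momentum**2)
    velocity = momentum / energy
    return rest_mass, velocity

rest_mass, velocity = absorb(m0=1.0, v=0.6, M=2.0)
print(rest_mass, velocity)
```

The composite rest mass exceeds M + m_0, since the kinetic energy of the
incoming particle has been converted into rest mass of the composite.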


1.8 A uniformly accelerated particle 

Let us calculate the motion of a uniformly accelerated particle. That
means that the particle has a constant acceleration $f_0$ in its rest frame.
Let us assume that the particle is moving in the x direction and starts
from rest at t = 0. In the general frame we assume the 4-velocity of the
particle to be $u^i$ and so its acceleration will be $du^i/ds = a^i$, say.
Now, from the general result

$$\eta_{ik}u^iu^k = 1$$

we get by differentiation with respect to s the result

$$u_i\frac{du^i}{ds} = 0. \tag{1.49}$$

Hence, in the rest frame the acceleration cannot have any component
in the time direction, its sole component being $f_0$ in the x direction.
Thus acceleration is a spacelike vector. From the invariance of its
magnitude in the general frame we will have components of $du^i/ds$ in the
x and t directions such that

$$\eta_{ik}\frac{du^i}{ds}\frac{du^k}{ds} = -f_0^2. \tag{1.50}$$

The negative sign indicates that the vector is spacelike. We have therefore

$$u_i = \gamma \times [1, -v, 0, 0]; \qquad \frac{du^i}{ds} = [a_0, a_1, 0, 0].$$

From Equations (1.49) and (1.50) we therefore have

$$a_0 - va_1 = 0, \qquad a_0^2 - a_1^2 = -f_0^2.$$

On solving these, we get the result

$$f_0 = \frac{d}{dt}\left(\frac{v}{\sqrt{1 - v^2}}\right).$$

This equation can be integrated to give

$$v = \frac{f_0t}{\sqrt{1 + f_0^2t^2}}, \tag{1.51}$$

where we have used the boundary condition that at t = 0, v = 0. Now,
assuming that at the initial instant the particle was at the origin, we get
the integral of this equation as

$$x = \frac{1}{f_0}\left[\sqrt{1 + f_0^2t^2} - 1\right]. \tag{1.52}$$

For $f_0t \ll 1$ we have the non-relativistic approximation giving the
familiar result from Newtonian dynamics: $v(t) = f_0t$, $x = f_0t^2/2$.

The proper time of the particle is given by

$$\tau = \int_0^t\sqrt{1 - v^2}\,dt = \frac{1}{f_0}\sinh^{-1}(f_0t). \tag{1.53}$$

As t increases, this grows much more slowly than t. We will use this
result to discuss the celebrated ‘twin paradox’ next.
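
The formulae of this section lend themselves to a quick numerical check. The
sketch below works in units where c = 1, takes the velocity from (1.51),
the position from integrating it once with x = 0 at t = 0, and compares the
closed-form proper time with a direct numerical integration of
$\sqrt{1 - v^2}$. The function names and the midpoint rule are choices made
here for illustration.

```python
import math

def velocity(t, f0):
    """Speed at coordinate time t for constant rest-frame acceleration f0, as in (1.51)."""
    return f0 * t / math.sqrt(1.0 + (f0 * t) ** 2)

def position(t, f0):
    """Distance covered from rest, obtained by integrating the velocity."""
    return (math.sqrt(1.0 + (f0 * t) ** 2) - 1.0) / f0

def proper_time(t, f0):
    """Closed-form proper time elapsed on the accelerated particle."""
    return math.asinh(f0 * t) / f0

def proper_time_numeric(t, f0, steps=100_000):
    """Midpoint-rule integral of sqrt(1 - v^2) dt from 0 to t."""
    dt = t / steps
    return sum(
        math.sqrt(1.0 - velocity((k + 0.5) * dt, f0) ** 2) * dt
        for k in range(steps)
    )

f0 = 1.0
for t in (0.001, 1.0, 3.0, 100.0):
    print(t, velocity(t, f0), position(t, f0), proper_time(t, f0))
```

For small f0*t the printed values approach the Newtonian ones, while for
large f0*t the speed saturates below 1 and the proper time grows only
logarithmically.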


1.9 The twin paradox 

In the early days of special relativity the time-dilatation effect of the kind 
described earlier in Section 1.6 was considered puzzling and paradoxical 
largely because one was then accustomed to the Newtonian absolute 
time. The paradox arose because each of the two inertial observers could 
apparently argue that the clock of the other was moving more slowly than 
his. This was clearly logically impossible. However, the resolution of the 
paradox lay in the fact that in each experiment one observer used two 
clocks in his frame to compare with one clock of the other observer 
moving relative to him. Thus there was no symmetry between the two 
observers and there was no contradiction if one found the other’s clock 
running slower. 

A more sophisticated paradox was invented subsequently to counter 
this asymmetry. Known as the twin paradox, it had two twin brothers, 
A and B, say, of which B stays at rest in his inertial frame while A takes 
off in a spacecraft attaining high speed. A goes far and returns after a 
considerable time as measured by B. But, since A was moving relative 
to B with high speed, A would be younger than B ... because A’s watch 
would run slower than B’s. Also, since A and B (unlike the clocks in 
the earlier paradox) meet at the same place at the beginning and end of 
the experiment, the effect must be real. For example, if A accelerates
and decelerates but has speed v = 4c/5 relative to B most of the time,
his watch would run at the rate $\sqrt{1 - v^2/c^2} = 3/5$ compared with
B’s watch. That is, if 40 years have elapsed as measured by B, only 24
years will have elapsed for A. So A will be younger than B by 16 years!

The paradox is not here. It arises when you argue that, as seen by 
A, his twin B has travelled away with high speed and come back to the 
same spot at the same time. Then, by the same argument B should be
younger than A by 16 years. So what is the correct situation?

To resolve the paradox, we note that the situation between A and B 
is not symmetric. B is in an inertial frame whereas A is accelerated and 
decelerated over stretches of his journey. We can take a simple example 
that A is uniformly accelerated over a period of 10 years at the end of 
which he attains a speed of 4c/5. Then he decelerates uniformly and 
comes to a halt after 10 years. He reverses his direction of travel and 
follows the same acceleration/deceleration pattern. Thus he is back with 
B after 40 years have elapsed by B’s watch. How long a time has elapsed 
according to A’s watch?

Using our formulae for uniform acceleration, we get the solution as
follows. First, at the end of t = 10 years, the speed attained is given by
formula (1.51):

$$\frac{v}{c} = \frac{f_0t/c}{\sqrt{1 + f_0^2t^2/c^2}}. \tag{1.54}$$

Now setting v/c = 4/5, we get the solution as $f_0t/c = 4/3$. The formula
(1.53) gives the elapse of proper time of A as

$$\tau = \frac{c}{f_0}\sinh^{-1}(f_0t/c) = \frac{3}{4}\,t\,\sinh^{-1}(4/3).$$

The total proper time of A for the entire journey works out at 32.96 
years. Thus A will return to find himself younger than B by about 
7 years. 
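
The arithmetic behind these figures can be reproduced as follows. The sketch
assumes, as in the text, four legs of 10 years each (B's time), each a
uniform acceleration or deceleration between rest and 4c/5; by symmetry a
deceleration leg takes the same proper time as an acceleration leg. The
function name is an illustrative choice.

```python
import math

def proper_time_leg(t_leg, v_final):
    """Proper time for one leg of uniform acceleration from rest to v_final.

    t_leg is the duration of the leg in B's frame; v_final is in units of c.
    """
    # Inverting (1.54): f0 * t / c = (v/c) / sqrt(1 - v^2/c^2).
    x = v_final / math.sqrt(1.0 - v_final**2)
    # Proper time (c / f0) * asinh(f0 t / c), with c / f0 = t_leg / x.
    return t_leg * math.asinh(x) / x

legs = 4
t_leg = 10.0   # years, as measured by B
tau_total = legs * proper_time_leg(t_leg, 0.8)

print(f"A's elapsed time: {tau_total:.2f} years")
print(f"age difference: {legs * t_leg - tau_total:.2f} years")
```

Here $\sinh^{-1}(4/3) = \ln 3$, so each leg contributes
$(3/4) \times 10 \times \ln 3$ years of proper time.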

At a deeper level, one may still raise the following question: of A 
and B, who is entitled to claim the status of an inertial observer? If there 
is no background to refer to, this may lead to an undecidable proposition. 
We will come back to this issue towards the end of this book when we 
discuss Mach’s principle. 


1.10 Back to electrodynamics 

Einstein was led to special relativity through Maxwell’s electromagnetic
theory. We close this chapter by highlighting a few issues that illustrate
the close relationship of electrodynamics, mechanics and special
relativity. We begin with Minkowski’s four-dimensional view. In terms of the
four-dimensional notation, Maxwell’s equations become more compact.

Consider a 4-vector $A_i$ as the 4-potential whose timelike component
serves as the electrostatic potential $\phi$ while the spacelike component
serves as the electromagnetic potential $\mathbf{A}$. The electromagnetic
fields are then related to the tensor

$$A_{i,k} - A_{k,i} = F_{ki}. \tag{1.55}$$

Thus we have

$$[F_{01}, F_{02}, F_{03}]$$

as the electric field and


$$[-F_{23}, -F_{31}, -F_{12}]$$

as the magnetic field. 

The Maxwell equations then acquire an elegant structure:

$$F^{ik}{}_{,i} = 4\pi j^k, \qquad F_{jk,i} + F_{ki,j} + F_{ij,k} = 0, \tag{1.56}$$

where the 4-current vector $j^k$ has its zeroth component as the charge
density and the remaining three components as the 3-current density.
The subscript comma indicates the derivative with respect to the indexed
coordinate.

To supplement these equations, we add the four-dimensional version
of the Lorentz force equation describing the motion of an electric charge
in an electromagnetic field:

$$\frac{du^i}{ds} = \frac{e}{m}F^i{}_ku^k, \tag{1.57}$$

m and e being the mass and charge of the particle.

We will not go into the details of this topic, which is covered in detail
in graduate-level texts in electrodynamics. We mention one fact, though,
which will have relevance to our later work in general relativity. This is
the derivation of the above equations from a single action principle, the
action being (with c = 1)

$$\mathcal{A} = -\frac{1}{16\pi}\int_V F_{ik}F^{ik}\,d^4x - \sum\int_\Gamma eA_i\,dx^i - \sum\int_\Gamma m\,ds. \tag{1.58}$$

The action is defined over a spacetime region of volume V and the
particles like m are supposed to move across it along world lines $\Gamma$.
The variation of the field variables leads to the (Maxwell) field equations
while the variation of the particle world lines yields their equations of
motion. Later, in Chapter 7, we will generalize this result.

Finally, a plane-wave solution of Maxwell’s equations takes the simple
form

$$A_m = a_m \times \exp(ik_jx^j), \tag{1.59}$$

where the null vector $k_j$ has angular frequency $\omega$ as the timelike
component, the spacelike part being the wavenumber vector $\mathbf{k}$
whose magnitude k is equal to $\omega/c$ and whose direction is that of the
propagation of the wave. Thus, if the frequency of the wave is $\nu$, then
$\omega = 2\pi\nu$ and we may write in the old three-dimensional notation

$$k_ix^i = 2\pi\nu(t - \mathbf{r}\cdot\mathbf{e}/c),$$

where $\mathbf{e}$ is the unit vector in the direction of propagation of
the wave.

1.10.1 The Doppler effect 

Using the above notation, we can easily derive the formula for the
relativistic Doppler effect, that is, the formula telling us how the
frequency (and direction of propagation) of a light source in motion
relative to an inertial observer depends on their relative motion. Using our
two inertial observers, we wish to know how the frequency and direction of
the wave change between O and O'.

We refer to Equation (1.59) above and note that the quantity in
the exponent is the phase of the wave and that this should be invariant
between two inertial observers.

Using Equations (1.17) for the Lorentz transformation we get

$$\begin{aligned}
\nu(t - \mathbf{e}\cdot\mathbf{r}/c) &= \nu'(t' - \mathbf{e}'\cdot\mathbf{r}'/c) \\
&= \nu'\gamma[t - \mathbf{v}\cdot\mathbf{r}/c + \mathbf{e}'\cdot(\mathbf{r}^* - \mathbf{v}t)/c].
\end{aligned} \tag{1.60}$$

By equating coefficients of t and $\mathbf{r}$ on both sides we get

$$\nu = \nu'\gamma(1 - \mathbf{e}'\cdot\mathbf{v}/c), \qquad \mathbf{e} = \frac{\mathbf{e}'^* - \mathbf{v}/c}{1 - \mathbf{e}'\cdot\mathbf{v}/c}. \tag{1.61}$$

Here $\mathbf{e}'^*$ is given in terms of $\mathbf{e}'$ just as
$\mathbf{r}^*$ is given in terms of $\mathbf{r}$ by (1.18).

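If we take without loss of generality the space components of these
vectors as $\mathbf{v} = (v, 0, 0)$, $\mathbf{e} = (\cos\theta, \sin\theta, 0)$,
$\mathbf{e}' = (\cos\theta', \sin\theta', 0)$, then we can reduce
Equations (1.61) to

$$\nu = \gamma\nu' \times (1 - v\cos\theta'/c) \tag{1.62}$$

and

$$\cos\theta = \frac{\cos\theta' - v/c}{1 - v\cos\theta'/c}, \qquad \sin\theta = \frac{\sin\theta'}{\gamma(1 - v\cos\theta'/c)}. \tag{1.63}$$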


In the case of radial motion away from the observer, we get the
answer as

$$\nu = \nu'\sqrt{\frac{1 - v/c}{1 + v/c}}. \tag{1.64}$$


That is, the source of light has a reduced apparent frequency as seen by 
the observer. This phenomenon is known as redshift, since in the visible 
spectrum the red colour lies at the lowest-frequency end. 
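
The relations (1.62) and (1.63) can be exercised numerically. The sketch
below writes beta for v/c and checks two things: that the transformed
direction is still a unit vector, and that the case of zero angle reduces to
the square-root redshift factor that follows from (1.62) with
$\cos\theta' = 1$. The function names are illustrative assumptions.

```python
import math

def doppler(nu_prime, theta_prime, beta):
    """Return (nu, cos_theta, sin_theta) from (1.62) and (1.63); beta = v/c."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    denom = 1.0 - beta * math.cos(theta_prime)
    nu = g * nu_prime * denom
    cos_theta = (math.cos(theta_prime) - beta) / denom
    sin_theta = math.sin(theta_prime) / (g * denom)
    return nu, cos_theta, sin_theta

def radial_recession(nu_prime, beta):
    """Special case theta' = 0 of (1.62), written as a square-root factor."""
    return nu_prime * math.sqrt((1.0 - beta) / (1.0 + beta))

nu, cos_theta, sin_theta = doppler(nu_prime=1.0, theta_prime=1.0, beta=0.6)
print(nu, cos_theta**2 + sin_theta**2)

nu_radial, _, _ = doppler(nu_prime=1.0, theta_prime=0.0, beta=0.6)
print(nu_radial, radial_recession(1.0, 0.6))
```

For beta = 0.6 the radial factor is one half, so the observed frequency is
half the emitted one.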

The formula (1.63) is used to explain the phenomenon of aberration 
as seen in the example that follows. 


Example 1.10.1 The formula (1.63) is useful in the measurement of the
change in the apparent direction of a light source when viewed from a moving
frame, with different velocities. The angle $\theta$ in one frame changes to
$\theta'$ as seen in the text. The change in the direction is called
aberration.

In the case of the Earth, the direction of its motion changes to the
opposite after six months (half the orbit). So in (1.63) v changes to −v,
leading to a change in the direction ($\theta$) of the source. The net shift
in direction, $\sim v\sin\theta/c$, is of the order of $10^{-4}$, but can be
measured. The aberration of the star $\gamma$ Draconis was first measured in
1725 by James Bradley and this was the first proof of Galileo’s firm belief
that the Earth moves.
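
The order of magnitude quoted in this example can be checked with the cosine
relation in (1.63). In the sketch below the Earth's orbital speed is taken
as roughly 29.8 km/s, which is an assumed round value, and the source is
placed at right angles to the Earth's motion, where the effect is largest.
The function name is an illustrative choice.

```python
import math

C = 299_792_458.0   # speed of light in m/s (SI value)
V_EARTH = 2.98e4    # approximate orbital speed of the Earth in m/s (assumed value)

def apparent_angle(theta_prime, beta):
    """Angle theta obtained from theta' through the cosine relation in (1.63)."""
    cos_theta = (math.cos(theta_prime) - beta) / (1.0 - beta * math.cos(theta_prime))
    return math.acos(cos_theta)

beta = V_EARTH / C
theta_prime = math.pi / 2

# Difference between the apparent directions six months apart (v -> -v).
shift = apparent_angle(theta_prime, beta) - apparent_angle(theta_prime, -beta)
shift_arcsec = math.degrees(shift) * 3600.0

print(f"v/c = {beta:.2e}")
print(f"shift = {shift:.3e} rad = {shift_arcsec:.1f} arcsec")
```

To first order in v/c the six-month shift is 2(v/c) sin(theta'), so the
result is a few times 10^-4 rad, a few tens of arcseconds.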


1.11 Conclusion 

This brings us to the end of a ‘crash course’ on special relativity. The 
reader may wish to study the subject at greater depth than dealt with 
here and to this end the References [4, 5, 6] may be worth a look. We 
have, however, built a framework from which to launch the study of a 
more elaborate theory that deals with arbitrarily accelerated observers 
and the modification of the Newtonian framework of gravitation. This 
is the general theory of relativity which determines the main interest of 
this book. 


Exercises 

1. Find whether the following vectors are timelike, spacelike or null.

(i) $A^0 = 4$, $A^1 = 3$, $A^2 = 2$, $A^3 = 1$.

(ii) The tangent to the circle $x^2 + y^2 = 1$, z = 0, t = 0.

(iii) The normal to the hyperboloid $x^2 + y^2 + z^2 - c^2t^2 = 1$.

(iv) The tangent to the curve parametrized by $\lambda$, where
$x^1 = \int r\sin\theta\,d\lambda$, $x^2 = \int r\cos\theta\,d\lambda$,
$x^3 = \int z\,d\lambda$ and $x^0 = \int\sqrt{r^2 + z^2}\,d\lambda$. The r,
$\theta$, z are arbitrary functions of $\lambda$.

2. From the application of a special Lorentz transformation work out how the 
electric and magnetic fields transform in vacuum. 

3. A rod moves with velocity 3c/5 in a straight line relative to an inertial frame 
S. In its rest frame the rod makes an angle of 60° with the forward direction 
of its motion. Show that in the frame S the rod appears to make an angle
$\cot^{-1}(4/(5\sqrt{3}))$ with the direction of motion.

4. A mirror is moving with speed $v$ in the $x$ direction with its plane surface
normal to it. In this frame a photon travelling in the $x$-$y$ plane is incident on the
mirror surface at angle $\theta$ to the normal. Show that, as seen from this frame, the
reflected photon makes an angle $\theta'$ with the normal to the mirror, where

$$\cos\theta' = \frac{\cos\theta + \cos\alpha}{1 + \cos\theta\cos\alpha}$$

and $\cos\alpha = 2v/(1 + v^2)$.

5. In a Compton-scattering experiment, a photon was scattered in a direction
making an angle of 60° with its original direction. Show that the wavelength of
the photon will have increased by $h/(2m_0c)$.

6. Show that, for the Maxwell equations, the quantities $B^2 - E^2$ and $\mathbf{B}\cdot\mathbf{E}$ are
invariant under the Lorentz transformation.

7. Show that if $\mathbf{E}\cdot\mathbf{B} = 0$ there is a Lorentz transformation that makes either $\mathbf{E}$
or $\mathbf{B}$ equal to zero.

8. An electric charge $q$ of rest mass $m$ moves in a circular orbit around a
magnetic field $B$ perpendicular to the orbit. The charge takes time $2\pi/\omega$ to
complete one orbit and the radius of the orbit is $R$. Show that

$$B = \frac{m\omega}{q\sqrt{1 - v^2}} = \frac{m\omega}{q\sqrt{1 - \omega^2R^2}}.$$

9. A source of light is moving towards an observer with speed $v$ such that its
direction of motion makes an angle $\theta$ with the line of sight to the source. If there
is zero Doppler shift, find $\theta$.

10. From the aberration formula derived in the text show that a source viewed
from the Earth today and six months later will show a shift in direction equal
to $(2v/c)\sin\theta$, where $\theta$ is the angle the direction to the source makes with the
Earth’s motion. Estimate the order of magnitude of the effect.

11. Can an electron by itself absorb or emit a single photon? If the answer is 
‘Yes’, show one example. If the answer is ‘No’, prove it. 

12. A particle of rest mass $M_0$ disintegrates into three particles of rest masses
$M_1$, $M_2$ and $M_3$, respectively. If, in the rest frame of the original particle, the
particles of rest masses $M_2$ and $M_3$ move at right angles and have equal energies,
calculate the energy of each particle and show that the following inequalities
must be satisfied:

$$M_1^2 < \tfrac{1}{2}(M_0 - 2M_2)^2 + \tfrac{1}{2}(M_0 - 2M_3)^2,$$

$$M_2^2 + M_3^2 < \tfrac{1}{2}(M_0 - M_1)^2.$$

13. Two electrons are approaching each other, each with energy $\gamma mc^2$ in the
laboratory frame. What is the energy of one electron as seen in the rest frame of 
the other? 

14. A traveller through interstellar space took off from the Earth at $t = 0$ with
constant acceleration $f$ and, after $t = t_1$, continued moving at the acquired
speed until $t = t_2$. Then he decelerated with constant deceleration $f$, until he
came to rest at $t = t_2 + t_1$. He then reversed the trajectory to return to Earth at
$t = 2(t_1 + t_2)$. What duration for this journey was registered on his own clock?

15. A speeding spaceship went through a red light at a traffic junction in
interstellar space. When stopped by cops, the driver said ‘But I saw only green lights’.
Could he be telling the truth? If so, how will you estimate his approximate
velocity?

16. A train 100 m long is approaching a tunnel 75 m long, at a speed of 0.8c. 
The tunnel keeper has instructions to close both (entrance and exit) ends of the 
tunnel simultaneously when the rear end of the train enters the tunnel. How long 
after the tunnel is sealed does the engine strike the exit door of the tunnel? For 
the engine driver, the tunnel appears shrunk to what length? Can the shrunken 
tunnel accommodate the train? How do you resolve the contradiction between the
tunnel keeper’s and the engine driver’s versions? (This example usually generates
considerable discussion.) 

17. An electron at rest is hit by an approaching photon with energy equal to the 
rest energy of the electron. Show that in the centre-of-mass frame of the system 
the electron is moving with speed half that of light. 



Chapter 2 

From the special to the general theory 
of relativity 


2.1 Space, time and gravitation 

The special theory of relativity reviewed in the last chapter marked a 
major advance in physics. The basic assumption that the fundamental 
laws of physics are invariant for all inertial observers looks at first sight a 
reasonable premise. However, as we saw in Chapter 1, its application to 
Maxwell’s equations of electromagnetic theory led to a drastic revision 
of how such observers make and relate their measurements of space and 
time. One consequence was that the Newtonian notions of absolute space 
and absolute time had to be abandoned and replaced by a unified entity 
of spacetime. The Galilean transformation relating the space and time 
measurements of two inertial observers had to be replaced by the Lorentz 
transformation. Strange and non-intuitive though the consequences of 
this transformation were, as we saw in Chapter 1, several experiments 
confirmed them. 

In spite of these successes, Einstein felt that the special theory 
addressed limited issues. For example, what was the nature of physical 
laws when viewed not in the inertial frames of reference, but in an 
accelerated one? Was there some more general principle that, when 
applied to these laws, preserved their form? Intuitively Einstein felt that 
some such situation must prevail. But that required a formalism more 
general than that provided by the Lorentz transformation. 

On another matter, of the two classical theories of physics known in 
the first decade of the twentieth century, the electromagnetic theory had 
played a major role in the genesis of special relativity. The invariance 
of Maxwell’s equations under the Lorentz transformation is linked with
the invariance of the speed of light for all inertial observers. The light
speed thus has a special status in spacetime and causality is preserved by
demanding that all physical interactions travel with speeds not exceeding
this speed. Nevertheless, this fiat was broken by the Newtonian
law of gravitation, the other known basic interaction. The gravitational 
attraction seemed to be instantaneous across space: that is, it seemed 
to propagate with infinite speed. Just as Newtonian dynamics had to be 
adapted to suit the new rules of spacetime measurements, the Newtonian 
law of gravitation also required a suitable adaptation. 

Hermann Bondi had highlighted the conflict between the law of 
gravitation and special relativity by the following example. Imagine that 
by ‘some magic’ the Sun is removed from its place. How and when will 
we on the Earth come to know of the event? Because sunlight takes about 
500 seconds to reach us, we would come to know of the Sun’s absence 
after that period. However, the disappearance of the Sun’s gravity will 
be ‘instantaneous’ and the Earth would move off its usual trajectory 
‘immediately’. Thus gravitational interaction would tell us of the event 
500 seconds before light would do so. This thought experiment, though 
physically impossible, illustrates the point. 
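
The figure of about 500 seconds can be checked with a one-line calculation. The sketch below is only illustrative: the value used for the mean Sun-Earth distance and the function name are assumptions added here, not quantities quoted in the text.

```python
C_M_S = 299_792_458.0     # speed of light in m/s
AU_M = 1.495978707e11     # assumed mean Sun-Earth distance in metres


def light_travel_time(distance_m, c=C_M_S):
    """Time in seconds for a light signal to cover distance_m."""
    return distance_m / c


if __name__ == "__main__":
    print(f"Sunlight delay: {light_travel_time(AU_M):.0f} s")
```

In Newtonian theory the gravitational signal would arrive with zero delay, so the gap between the two messages is the full light travel time.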

One may think that it is relatively simple to adapt Newtonian gravitation
to suit special relativity. But the reality is different! For example,
consider the Poisson equation

$$\nabla^2\phi = 4\pi G\rho, \tag{2.1}$$

where $\phi$ is the gravitational potential and $\rho$ is the mass density. If one
wishes to make the interaction travel with the speed of light, one may
change the ‘$\nabla^2$’ operator on the left-hand side to the wave operator $\Box$.
This can be easily done. But look at the right-hand side. Since special
relativity teaches us that $E = Mc^2$, we need to include in $\rho$ all the energy
densities also. Now, Newtonian theory tells us that gravitation itself has 
energy, and so we need to include it on the right-hand side. The energy 
density of gravitation has a form something like 

$$\epsilon = -\left[(\nabla\phi)^2 - \dot{\phi}^2\right]/(8\pi G). \tag{2.2}$$

In other words the modified equation (2.1) has a $(\nabla\phi)^2$-dependent term
on the right-hand side. Thus the problem has become non-linear. One
may further ask whether the self-action from $\phi$ will not change its value
further. Indeed it will! In fact Equation (2.1) becomes even more complicated.
We will not get further into this issue here since our purpose, to
demonstrate that the original Newtonian law becomes non-linear if we 
try to make it consistent with special relativity, has already been served. 

While this example illustrates that the Newtonian law needs to be 
modified in the presence of special relativity, the reverse is also true.
We need to modify special relativity in the presence of gravity. For we
have looked at how special relativity functions in the framework of 
inertial observers. As defined in Chapter 1, these observers are moving 
under no force. Can one find such observers in reality? The answer is 
‘Not in the presence of gravitation!’. For it is not possible to isolate 
a finite-sized region inside which there is no gravitational force for a 
finite time. What about astronauts in spaceships apparently floating in 
a gravity-free region? Even there gravity is not altogether absent: for, 
wherever there is matter, there is a gravitational effect, howsoever small 
it may be. So we need a theory that allows for the ever-present gravity. 
We will return to this issue shortly. The theory that Einstein came up 
with to address this issue, the general theory of relativity, represents
perhaps the most remarkable flight of imagination in science. 

Every major scientific theory carries its own mark of distinction. The 
distinctive feature of Newtonian gravitation is the radial inverse-square 
law. To those uninitiated in the laws of dynamics, the fact that a planet 
goes around the Sun under a force of attraction towards the Sun comes 
as a surprise. Yet this is a natural consequence of the inverse-square 
law. The major achievement of Maxwell’s electromagnetic theory was 
the unification of electricity and magnetism and the demonstration that 
light itself is an electromagnetic wave. The unique place held by the 
speed of light characterizes Einstein’s special theory of relativity, while 
quantum mechanics can point to the uncertainty principle as the crucial 
feature that sets it apart from classical mechanics. 

To what distinctive feature can general relativity lay its own special 
claim? A clue to the answer to this question is provided in the title of 
this section. 

Let us compare gravitation with electricity. We know that two unlike 
electric charges attract each other through the Coulomb inverse-square 
law, just as any two masses attract each other gravitationally by the 
Newtonian inverse-square law. To this extent, electricity and gravitation 
are similar. However, we can go no further! We also know that two like
electric charges repel each other and that this property seems to have no 
parallel in gravitation. Every bit of matter attracts every other bit and, 
as yet, we do not have any instance of gravitational repulsion. 

We can express this difference between electricity and gravitation 
in another, more practical way. The existence of repulsion as well as 
attraction with positive and negative charges enables us to construct a 
closed chamber whose interior is completely sealed from any outside 
electrical or magnetic influence. Not so with gravitation! We cannot 
point to any region of space as being totally free of external gravitational
influences. Gravitation is permanent: it cannot be switched off
at will.

This ever-present nature of gravitation plays a key role in Einstein’s
general theory of relativity. Einstein argued that, because of its permanence,
gravitation must be related to some intrinsic feature of space and
time (which are also permanent!). With a master stroke of genius, he
identified this feature as the geometry of space and time. He suggested
that any effects we ascribe to gravitation actually arise because the 
geometry of space and time is ‘unusual’. Let us now try to understand 
what is meant by the word ‘unusual’ and how this property of space 
and time leads to gravitational effects - for therein lies the distinctive 
characteristic that sets general relativity apart from other physical 
theories. 

2.2 Non-Euclidean geometries 

The ‘usual’ geometry of space, the geometry that we learn at school 
and apply in so many ways, is the geometry whose foundations were 
laid by the Greek mathematician Euclid c. 300 BC. Euclidean geometry 
is a logical structure wherein theorems about triangles, parallelograms, 
circles and so on are proved on the basis of postulates that are taken as 
self-evident. Thus the results shown in Figure 2.1 follow as theorems in 
Euclid’s geometry, being based on the original postulates of Euclid, and 
the validity of these results appears to be borne out by measurements of 
lengths and angles in physical space. 

Postulates are assumptions that are regarded as self-evident and 
are not expected to be ‘proved’. One such postulate is illustrated in 
Figure 2.2. Here we have a straight line l with a point P outside it. How 
many straight lines can we draw through P parallel to l? Our experience
suggests that the answer is ‘only one’. But can this expectation be further 
proved? In Euclid’s geometry this is taken as a postulate (sometimes 
known as the parallel postulate) and many of its theorems are based on 
it. One such theorem is that the three angles of a triangle add up to two 
right angles. 

It was only in the nineteenth century that mathematicians realized that there
is nothing sacrosanct about Euclid’s postulates. Provided that they are not
mutually contradictory, any new set of postulates can lead to a new and
consistent type of geometry. Indeed, as the work of such mathematicians
as Gauss (1777-1855), Bolyai (1802-1860), Lobachevsky (1792-1856)
and Riemann (1826-1866) showed, a host of such new geometries can
be constructed. These are collectively called non-Euclidean geometries.
For instance, the parallel postulate can be changed: one may assume
that there is no straight line through P parallel to l, or one may assume
that several lines can be drawn through P parallel to l. Geometries using
these revised postulates will be non-Euclidean. 


Fig. 2.1. Some familiar results of Euclid's geometry: above, the three
angles of a triangle add up to 180°; below, Pythagoras' theorem that the
square of the hypotenuse of a right-angled triangle equals the sum of the
squares of its other two sides.

Fig. 2.2. The parallel postulate of Euclid, described in the text.

Fig. 2.3. The two theorems described in Figure 2.1 change for the
geometry on the surface of a sphere as shown here; for example,
$AB^2 + BC^2 > AC^2$.

In this sense, the geometry on the surface of a sphere is non-Euclidean.
If we define a straight line on the surface of a sphere as
the line of shortest distance between two points, it is easy to see that
these lines are arcs of great circles. Because any two great circles intersect,
there are no parallel lines in this geometry: no line can be drawn
through P parallel to l. Figure 2.3 demonstrates how the theorem about
the sum of the three angles of a triangle breaks down in this case. 
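
The breakdown of the angle-sum theorem can be made concrete with a short numerical sketch. The Python code below measures the angles between great-circle arcs at each vertex of a triangle on the unit sphere. The helper names and the choice of test triangles are assumptions made for this illustration.

```python
import math


def unit(v):
    """Scale a 3-vector to unit length so that it lies on the unit sphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _tangent(at, towards):
    """Unit tangent at `at` of the great-circle arc heading to `towards`."""
    d = _dot(at, towards)
    t = [w - d * a for a, w in zip(at, towards)]
    return unit(t)


def vertex_angle(a, b, c):
    """Angle at vertex a between the arcs a-b and a-c, in radians."""
    cos_angle = _dot(_tangent(a, b), _tangent(a, c))
    return math.acos(max(-1.0, min(1.0, cos_angle)))


def angle_sum_degrees(a, b, c):
    """Sum of the three interior angles of the spherical triangle abc."""
    total = vertex_angle(a, b, c) + vertex_angle(b, c, a) + vertex_angle(c, a, b)
    return math.degrees(total)


if __name__ == "__main__":
    # One octant of the sphere: every angle is a right angle.
    octant = ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0])
    print(angle_sum_degrees(*octant))
    # A very small triangle is nearly Euclidean.
    small = ([0.0, 0.0, 1.0], unit([0.01, 0.0, 1.0]), unit([0.0, 0.01, 1.0]))
    print(angle_sum_degrees(*small))
```

The octant triangle has three right angles, while the small triangle has an angle sum only slightly above 180°, which is why Euclid's rules appear to hold for measurements over short distances.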

We may also mention in passing that Euclidean straight lines such as
latitude lines drawn on a flat map of the Earth do not represent paths of
shortest distance. Rather, the paths of shortest distance look curved when
drawn on flat maps. Aircraft pilots choosing to fly along the shortest
routes follow these paths.
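
To see how much longer the 'straight' map route can be, the sketch below compares the distance along a circle of latitude with the great-circle distance between two points at the same latitude. The Earth radius of 6371 km and the function names are assumptions introduced for this illustration.

```python
import math

R_EARTH_KM = 6371.0   # assumed mean radius of the Earth in km


def path_along_parallel(lat_deg, dlon_deg, radius=R_EARTH_KM):
    """Length of the route that stays on the circle of latitude."""
    return radius * math.cos(math.radians(lat_deg)) * math.radians(dlon_deg)


def great_circle_path(lat_deg, dlon_deg, radius=R_EARTH_KM):
    """Shortest route on the sphere between two points at the same latitude."""
    lat = math.radians(lat_deg)
    dlon = math.radians(dlon_deg)
    # Cosine of the angular separation, from the spherical law of cosines.
    cos_sep = math.sin(lat) ** 2 + math.cos(lat) ** 2 * math.cos(dlon)
    return radius * math.acos(max(-1.0, min(1.0, cos_sep)))


if __name__ == "__main__":
    lat, dlon = 45.0, 90.0
    print(f"along the parallel: {path_along_parallel(lat, dlon):.0f} km")
    print(f"great circle:       {great_circle_path(lat, dlon):.0f} km")
```

On the equator the two routes coincide, since the equator is itself a great circle; at any other latitude the great-circle route is the shorter one.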

We will now explore the possibility that Einstein advocated, namely 
that the effect of gravity can be described not through the conventional 
Newtonian interpretation of a force but by ascribing a non-Euclidean 
character to the geometry of spacetime. 

2.3 Gravity, geometry and dynamics 

The Einsteinian concept of non-Euclidean geometry of space extended 
to space and time or spacetime can be illustrated by an example in
dynamics. Figure 2.4 shows the spacetime trajectory followed by a stone
thrown vertically upwards with an initial velocity $v$. Counting time $t$ from
the instant of throw, the height $y$ of the stone at any instant $t$ is given by

$$y = vt - \tfrac{1}{2}gt^2, \tag{2.3}$$

where g is the acceleration due to gravity. This is a Newtonian equation 
and may be interpreted as follows. If there were no gravity, the stone 
would have continued to move in a straight line with uniform speed 
v directed upwards, as indicated by the first term only on the right-hand
side of the above equation. In Figure 2.4 this trajectory is shown
by a dotted straight line. By contrast the actual trajectory is shown by 
the parabolic continuous line. In the Newtonian framework, the dotted 
trajectory illustrates the first law of motion, whereas the continuous one 
arises from the second law of motion because of the application of the 
force of gravity. 

According to Einstein, the interpretation of this phenomenon would 
be as follows. First we have to recognize that the gravity of the Earth 
has a permanent existence. It cannot be ‘turned on’ or ‘turned off’. So 
we should not talk of the dotted trajectory, dealing as it does with a 
possibility that cannot happen. The continuous trajectory is the only trajectory
that we have to interpret. Only we now assume that the geometry
of spacetime in which the stone is moving is non-Euclidean, being rendered
so by the presence of Earth’s gravity. So the apparently parabolic-looking
trajectory is actually describing ‘straight line motion with a
uniform velocity’ but in a non-Euclidean spacetime. 

At this juncture a sceptic may argue that the continuous line of 
Figure 2.4 clearly demonstrates a curved trajectory with a varying velocity.
How can we call it a straight-line motion with uniform speed? The
answer is that, if the rules of geometry are changed, so are the definitions
of straight lines and measurements of velocities. In Figure 2.3, for
example, the apparently curved arcs on the sphere are in fact straight 
lines in the spherical geometry. Similarly, in this case, the geometry of 
spacetime is changed in such a way that the trajectory that is curved 
in Euclidean geometry becomes straight! Recall the example of the flat 
map of the Earth in the previous section. Similarly, the constancy of 
velocity is to be determined not by the rules of Euclidean measurements 
but by the prevailing non-Euclidean geometry. 

To summarize, therefore, Einstein’s way of describing gravity is to 
do away with the notion that it is a force. Rather the trick is to find 
a suitably non-Euclidean spacetime geometry in which matter under 
no force moves in ‘straight-line trajectories with uniform speed’ as 
measured in terms of the rules of the new geometry. 



Fig. 2.4. As explained in the text, in the Newtonian dynamics the dotted
trajectory describes straight-line motion without acceleration. The curved
continuous line shows the trajectory wherein acceleration is induced by
gravity. According to Einstein only this trajectory has real status and it
describes uniform motion in a straight line in the spacetime 'curved' by
gravity.

We need a mathematical apparatus to describe these ideas precisely.
Also, if spacetime is ‘curved’ because of its non-Euclidean geometry 
we need suitable machinery to describe physics in it. For, as we just saw, 
even the simple concept of velocity requires re-definition. So we will 
begin our discussion of general relativity by setting up the framework 
needed for these purposes. Chapter 3 accordingly starts the process. 

Exercises 

1. An optical photon has wavelength 500 nm. Estimate its gravitational mass. 

2. Two airports exist on the same latitude $\theta$ but on longitudes differing by 180°.
Show that an aircraft connecting them flying along the latitude $\theta$ will exceed the
shortest path by $2R[\theta - \pi\sin^2(\theta/2)]$, where $R$ is the radius of the Earth.

3. A triangle on the spherical Earth has its vertices at specified latitude ($l$) and
longitude ($L$) as follows: vertex A at $l$ = 45°, $L$ = 180°; vertex B at $l$ = 0°,
$L$ = 120° and vertex C at $l$ = 0°, $L$ = 240°. By what amount does the sum of
the three angles of the triangle ABC exceed 180°?

4. Assuming that the Newtonian potential at an external point at distance $R$
from the centre of a spherical mass distribution is $GM/R$, estimate a correction
to it to first order using formula (2.2) for gravitational energy.

5. Consider a sphere of uniform density $\rho$ and radius $R$. Calculate the Newtonian
gravitational potential $\phi$ for this sphere. Compare the gravitational energy source
$(\nabla\phi)^2/(8\pi G)$ with $\rho$.

6. A projectile is ejected with vertical speed $v$ from the surface of a planet of
mass $M$ and radius $R$. Show that it comes to rest at a distance $r$ from the centre
of the planet, where

$$r = \frac{R}{1 - \dfrac{Rv^2}{2GM}},$$

provided that the denominator is positive. Put $v = c$ in the above example and
deduce the condition between $M$ and $R$ such that the planet acts like a Newtonian
black hole.



Chapter 3 

Vectors and tensors 


3.1 The spacetime metric 

The classical definition of ‘geometry’ is that it is a science of measurements
of distances and angles. Let us first recall a familiar result from
special relativity in the following form. Let $(x, y, z)$ denote a Cartesian
coordinate system and $t$ the time measured by an observer $O$ at rest in an
inertial frame, that is by an observer who is acted on by no force. Let two
neighbouring events in space and time be labelled by the coordinates
$(x, y, z, t)$ and $(x + dx, y + dy, z + dz, t + dt)$. The resulting analogue
of the Pythagorean theorem is as follows. The square of the ‘distance’
between the two events is given by

$$ds^2 = c^2dt^2 - dx^2 - dy^2 - dz^2. \tag{3.1}$$

The distance $ds$ is invariant under Lorentz transformation in the sense
that another inertial observer $O'$ using a different coordinate system
$(x', y', z', t')$ to measure this distance will find the same answer.

However, when we make a transition from special to general relativity
and quantify Einstein’s idea that the geometry of space and time
is unusual in the presence of gravitation, we abandon the simple form
of (3.1) in favour of a more complicated form. The more complicated
form is still quadratic, and we may state it formally as follows:

$$ds^2 = \sum_{i,k=0}^{3} g_{ik}\,dx^i\,dx^k. \tag{3.2}$$

Here we have modified the notation as follows. The coordinates are
now called $x^i$, with $i = 1, 2, 3$ representing the three space coordinates
and $i = 0$ the time coordinate. The coefficients $g_{ik}$ are functions of $x^i$
with the property that the matrix $\|g_{ik}\|$ has the signature $-2$.¹

The expressions for $ds^2$ are often referred to as the ‘line element’
or the ‘metric’. Thus we have the general expectation that the spacetime
metric has signature $-2$. We shall denote the determinant of the matrix
$\|g_{ik}\|$ by $g$ and its inverse matrix by $\|g^{ik}\|$. It is easy to verify that $g$ is
negative, since the diagonalized form has one positive and three negative
coefficients.

Clearly, the geometry of spacetime in which the basic invariant distance
is given by (3.2) instead of by (3.1) is going to be more complicated
to describe. Its properties will depend on the functions $g_{ik}$. But do these
complications arise simply because of a choice of coordinates, or do 
they indicate a spacetime with a geometry genuinely different from that 
used in special relativity? 


Example 3.1.1 Consider for example the form

$$ds^2 = c^2dt^2 - [dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\phi^2)],$$

which looks more complicated than (3.1). Does it describe some new geometry?
A little investigation will show that it is obtainable from (3.1) by the
coordinate transformation

$$x = r\cos\theta, \quad y = r\sin\theta\cos\phi, \quad z = r\sin\theta\sin\phi.$$

We do not expect that a fundamental change in the properties of spacetime,
such as its geometry, should be brought about by such a change of
coordinates. However, consider another example.

Let us take the geometry on the surface of a sphere $\Sigma$ of radius $a$. If
we consider the sphere as embedded in a three-dimensional space with the
Cartesian coordinates $x, y, z$, we may write the equation of the surface of
the sphere as

$$x^2 + y^2 + z^2 = a^2.$$

However, can we study the geometry on this surface without recourse
to the embedding space? For describing the geometry on the surface of
the sphere it is more convenient to use coordinates intrinsic to the surface
of the sphere. Such coordinates are available and are like the latitude
and longitude used by geographers to locate a point on the Earth. More
specifically,

$$x = a\cos\theta, \quad y = a\sin\theta\cos\phi, \quad z = a\sin\theta\sin\phi,$$

so that for any $(\theta, \phi)$ with $0 \le \theta \le \pi$ and $0 \le \phi < 2\pi$ we can locate a point
$(x, y, z)$ on the surface of the sphere. Spherical trigonometry tells us how
to measure and relate the angles, sides and so on of triangles drawn on this
surface. The rules of Euclid’s geometry do not apply to these measurements.

¹ This means that, if at any spacetime point the quadratic form (3.2) is diagonalized, it
has one square term with a positive coefficient and three square terms with negative
coefficients. The signature equals the number of positive terms minus the number of
negative terms.


In our above example, the square of the distance between two neighbouring
points $(\theta, \phi)$ and $(\theta + d\theta, \phi + d\phi)$ is given by

$$d\sigma^2 = [dx^2 + dy^2 + dz^2]_\Sigma = a^2(d\theta^2 + \sin^2\theta\,d\phi^2). \tag{3.3}$$

Thus we have here another example of $g_{ik}$ that are not all constants, a
property shared with the previous example. However,
in the earlier case the geometry was Euclidean, whereas here it is not.
Simply having a coordinate-dependent $g_{ik}$ does not therefore convey the
physical reality. So the mathematical formalism that we build up should 
be such that it can distinguish between real effects and coordinate effects. 
In a qualitative way we can see that the essential information must survive 
even when we change from one coordinate system to another. In order 
to extract such information, we must devise machinery that tells us 
what things remain unchanged under coordinate transformations. Such 
machinery is provided by the invariants, the vectors and the tensors, 
which we shall now study. 
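
The line element (3.3) can be checked numerically without any algebra. The sketch below, whose function names and sample values are assumptions made for illustration, embeds two neighbouring points of the sphere in three-dimensional space using the parametrization given above and compares the squared Euclidean separation with $a^2(d\theta^2 + \sin^2\theta\,d\phi^2)$.

```python
import math


def embed(theta, phi, a=1.0):
    """Point on the sphere of radius a, using the parametrization in the text."""
    return (a * math.cos(theta),
            a * math.sin(theta) * math.cos(phi),
            a * math.sin(theta) * math.sin(phi))


def chord_squared(theta, phi, dtheta, dphi, a=1.0):
    """dx^2 + dy^2 + dz^2 between two nearby points on the sphere."""
    p = embed(theta, phi, a)
    q = embed(theta + dtheta, phi + dphi, a)
    return sum((u - v) ** 2 for u, v in zip(p, q))


def metric_squared(theta, dtheta, dphi, a=1.0):
    """a^2 (dtheta^2 + sin^2(theta) dphi^2), the right-hand side of (3.3)."""
    return a ** 2 * (dtheta ** 2 + math.sin(theta) ** 2 * dphi ** 2)


if __name__ == "__main__":
    theta, phi, step, a = 1.0, 0.5, 1e-5, 2.0
    lhs = chord_squared(theta, phi, step, step, a)
    rhs = metric_squared(theta, step, step, a)
    print(f"relative difference: {abs(lhs - rhs) / rhs:.1e}")
```

The two expressions agree up to terms of higher order in the coordinate differences, which is all that a statement about neighbouring points requires.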


3.2 Scalars and vectors 

Let us first introduce the summation convention which was already used 
in a limited way in Section 1.5.1. We will frequently encounter sums 
like 

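$$\sum_{i=0}^{3} A_i B^i, \qquad \sum_{k=0}^{3} A_{ik} B^k, \qquad \sum_{i,k=0}^{3} P_{ik}\,\xi^i\xi^k. \tag{3.4}$$

It is convenient in such cases to drop the summation symbol and
write these quantities as

$$A_i B^i, \qquad A_{ik} B^k, \qquad P_{ik}\,\xi^i\xi^k, \tag{3.5}$$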

the rule being that, whenever an index appears once as a subscript and 
once as a superscript in the same expression, it is automatically summed 
over all the values (from 0 to 3). Thus we can rewrite (3.2) in the more
compact form

$$ds^2 = g_{ik}\,dx^i\,dx^k. \tag{3.6}$$

A note of caution is needed here: the summation convention does not
apply under any other circumstances. Thus it does not apply to quantities 
like 


$$A_i B_i, \qquad A^i B_i C_i, \qquad \ldots, \tag{3.7}$$


wherein repeated indices do not follow the rule of appearing only twice, 
once up and once down. However, such expressions fortunately do not 
arise in most relativistic calculations. Indeed, the appearance of such 
‘monster’ expressions is a warning that we have made a mistake in 
our index manipulation. At this stage the appearance of subscripts and 
superscripts may seem somewhat arbitrary. We ask the reader to be 
patient: they will be properly introduced into the formalism very shortly. 

We will assume that the Latin indices i, j, k, . . . will run over all 
four values 0, 1, 2, 3. On some (rather infrequent) occasions we may
want to refer to index values 1, 2, 3 only, which are usually reserved for
space components, and we will use Greek indices $\mu, \nu, \ldots$ to represent
these. Thus $A_\mu B^\mu$ will equal $A_1 B^1 + A_2 B^2 + A_3 B^3$.
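
For readers who find it helpful to see the convention spelled out as explicit loops, here is a minimal Python sketch. The function names, the sample components and the choice of units with $c = 1$ are assumptions made for this illustration.

```python
def contract(lower, upper, indices=range(4)):
    """Sum lower[i] * upper[i] over the given index values."""
    return sum(lower[i] * upper[i] for i in indices)


def line_element(g, dx):
    """ds^2 = g_ik dx^i dx^k with both indices summed from 0 to 3."""
    return sum(g[i][k] * dx[i] * dx[k] for i in range(4) for k in range(4))


# Flat-spacetime metric of (3.1) in units with c = 1 (signature -2).
MINKOWSKI = [[1, 0, 0, 0],
             [0, -1, 0, 0],
             [0, 0, -1, 0],
             [0, 0, 0, -1]]

if __name__ == "__main__":
    A = [1, 2, 3, 4]   # hypothetical components A_i
    B = [5, 6, 7, 8]   # hypothetical components B^i
    print(contract(A, B))               # Latin index: i = 0, 1, 2, 3
    print(contract(A, B, range(1, 4)))  # Greek index: space values 1, 2, 3 only
    print(line_element(MINKOWSKI, [2, 1, 0, 0]))
```

A repeated Latin index corresponds to a loop over all four values, whereas a repeated Greek index loops over the three space values only.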

It is worth pointing out here that many other textbooks use the convention
of denoting the spacetime coordinates by Greek indices $\lambda, \mu, \nu$,
etc. and the space coordinates by Latin indices $i, j, k$, etc. Also many
authors prefer to write (3.1) with the opposite sign for the right-hand side. 
In that case the signature of the metric is +2. Likewise, in some texts, 
time is treated as coordinate number 4 instead of 0, as it is here. These 
differences are of a ‘cosmetic’ nature and do not affect the ‘physics’ 
being described. We caution the reader to check these differences before 
comparing expressions from different sources. 

We now consider a simple example in two-dimensional Euclidean 
space, i.e., in a space where Euclid’s geometry holds. Let, as in Figure 3.1,
$OX_1$ and $OX_2$ denote two Cartesian coordinate axes corresponding to
coordinates $x_1$ and $x_2$, respectively. Suppose we have a vector A with two
components $A_1$ and $A_2$ in these directions. We now change coordinates
by rotating the axes by an angle $\alpha$. The new coordinates $x'_1$ and $x'_2$ are
given in terms of the old ones by the formulae

$$x'_1 = x_1\cos\alpha + x_2\sin\alpha, \qquad x'_2 = x_2\cos\alpha - x_1\sin\alpha. \tag{3.8}$$

Fig. 3.1. A change of Cartesian coordinates changes the components of a
vector, although the vector remains unchanged. Here we see the effect of
rotation of axes around the origin.

Notice that under this transformation the components of A also
transform in a similar fashion:

$$A'_1 = A_1\cos\alpha + A_2\sin\alpha, \qquad A'_2 = A_2\cos\alpha - A_1\sin\alpha. \tag{3.9}$$

Now in the usual definition we associate a vector with a magnitude 
and a direction. The above equations keep track of the direction of the 
vector, ensuring that two different observers, one using the unprimed 
coordinates and the other using the primed ones, are talking about the 
same entity even though they are measuring different components relative
to their axes. They also agree on the magnitude of the vector, since
it is easy to verify that

$$A_1'^2 + A_2'^2 = A_1^2 + A_2^2. \tag{3.10}$$

In short, these transformation laws preserve the physical essentials of 
a vector, namely its magnitude and direction. We will be guided by 
this simple example in generalizing the concept of a vector under any 
coordinate transformation. 
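
The invariance (3.10) is easy to confirm numerically. In the sketch below the function names and the sample vector are assumptions chosen for illustration; the transformation itself is the rotation (3.9).

```python
import math


def rotate_components(a1, a2, alpha):
    """Components in axes rotated by alpha, following (3.9)."""
    return (a1 * math.cos(alpha) + a2 * math.sin(alpha),
            a2 * math.cos(alpha) - a1 * math.sin(alpha))


def magnitude_squared(a1, a2):
    """Squared magnitude of a two-dimensional Euclidean vector."""
    return a1 ** 2 + a2 ** 2


if __name__ == "__main__":
    a1, a2 = 3.0, 4.0
    for alpha in (0.0, 0.7, math.pi / 2):
        b1, b2 = rotate_components(a1, a2, alpha)
        print(f"alpha={alpha:.2f}: components=({b1:.3f}, {b2:.3f}), "
              f"magnitude^2={magnitude_squared(b1, b2):.3f}")
```

The components change with the angle of rotation, but the squared magnitude stays the same for every angle, as (3.10) asserts.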


3.2.1 Scalars 

We now introduce spacetime as a manifold M of 3 + 1 dimensions in 
which a typical point $P$ is specified by four coordinates $x^i$. We shall in
general talk about geometrical entities of M or of physical quantities
defined in M, which are continuous and differentiable (at least twice)
with respect to $x^i$. It is within this manifold that we now proceed to
describe our geometry-physics relationship. We begin with the simplest
physical notion.

A scalar or an invariant does not change under any change of
coordinates. Thus if $\phi(x^i)$ is a function of coordinates, then it is invariant
provided that it retains its value under a transformation from $x^i$ to new
coordinates $x'^i$:

$$\phi(x^i) = \phi[x^i(x'^k)] = \phi'(x'^k). \tag{3.11}$$

Note that the form of the function may change, but its value does
not. Note further that the infinitesimal square of distance (3.6) is a
scalar quantity. In our example of vectors in two dimensions, we had
encountered the property that the magnitude of a vector does not change
under the coordinate transformation representing rotation of axes. It is
therefore a scalar.

Fig. 3.2. The tangent to the curve at P acts like a contravariant vector.

3.2.2 Contravariant vectors

Suppose we are given a curve $\Gamma$ in space and time, which is parametrized
by $\lambda$. (See Figure 3.2.) Thus, the points along the curve have coordinates

$$x^i = x^i(\lambda), \tag{3.12}$$

where $x^i$ are given functions of $\lambda$. The direction of the tangent to $\Gamma$ at
any point P on it is given by a vector with four components,

$$A^i = \frac{dx^i}{d\lambda}. \tag{3.13}$$


Notice that the direction of a tangent to the curve is an invariant concept:
a change of coordinates should not alter this concept, although its four
components in the new coordinates will be different. Suppose the new
coordinates are $x'^i$ and the new components are $A'^i$. Then

$$A'^i = \frac{dx'^i}{d\lambda}. \tag{3.14}$$


As stated earlier, we will assume that the transformation functions

$$x^i = x^i(x'^k), \qquad x'^k = x'^k(x^i) \tag{3.15}$$

are continuous and possess at least second derivatives. It is then easy to
see, by the chain rule, that $A'^k$ and $A^i$ are related by the linear transformation

$$A'^k = \frac{\partial x'^k}{\partial x^i} A^i. \tag{3.16}$$

We use (3.16) as the transformation law for any vector $A^i$. Quantities
in general that transform according to the above linear law are called
contravariant vectors. The four components of a contravariant vector
are specified by a superscript.


Example 3.2.1 Consider the curve parametrized by

$$x^0 = \text{constant}, \quad x^1 = \text{constant}, \quad x^2 = \lambda, \quad x^3 = \lambda^2.$$

The tangent to this curve is specified by the contravariant vector $A^i$ with
components

$$A^0 = 0, \quad A^1 = 0, \quad A^2 = 1, \quad A^3 = 2\lambda.$$
A comparison with the two-dimensional Cartesian example will
show that the transformation law (3.16) used above is a generalization
of the law (3.9) used there. In that simple example the coordinate
transformation was linear and so the coefficients $\partial x'^i/\partial x^k$ and $\partial x^k/\partial x'^i$ were
constants, $\cos\alpha$, $\sin\alpha$, etc.
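
For a nonlinear change of coordinates the coefficients in (3.16) vary from point to point, and the law can be tested against a direct differentiation of the transformed curve. The sketch below uses the curve of Example 3.2.1; the particular transformation in `new_coords`, the constant values chosen for $x^0$ and $x^1$, and all function names are assumptions made only for illustration.

```python
def curve(lam):
    """Curve of Example 3.2.1; the constants 1.0 and 2.0 are arbitrary choices."""
    return [1.0, 2.0, lam, lam**2]

def tangent(lam):
    """Tangent A^i = dx^i/d(lambda) in the original coordinates."""
    return [0.0, 0.0, 1.0, 2.0 * lam]

def new_coords(x):
    """A hypothetical nonlinear change of coordinates x -> x'."""
    return [x[0], x[1], x[2] + x[3], x[2] * x[3]]

def jacobian(x):
    """Matrix J[k][i] = d x'^k / d x^i for the transformation above."""
    return [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 1.0],
        [0.0, 0.0, x[3], x[2]],
    ]

def transform_contravariant(A, x):
    """Apply the law (3.16): A'^k = (d x'^k / d x^i) A^i."""
    J = jacobian(x)
    return [sum(J[k][i] * A[i] for i in range(4)) for k in range(4)]

def tangent_new_numeric(lam, h=1e-6):
    """Central-difference estimate of dx'^k/d(lambda) along the transformed curve."""
    xp = new_coords(curve(lam + h))
    xm = new_coords(curve(lam - h))
    return [(p - m) / (2.0 * h) for p, m in zip(xp, xm)]

lam = 0.7
A_new_law = transform_contravariant(tangent(lam), curve(lam))
A_new_direct = tangent_new_numeric(lam)
print(A_new_law)
print(A_new_direct)
```

The two lists agree to within the finite-difference error, which is what (3.14) and (3.16) together assert.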


3.2.3 Covariant vectors 

Consider next a scalar function $\phi(x^k)$. The equation

$$\phi(x^k) = \text{constant} \tag{3.17}$$

describes a hypersurface (that is, a surface of three dimensions) $\Sigma$,
whose normal at a typical point Q has the direction given by the four
quantities

$$B_i = \frac{\partial \phi}{\partial x^i}. \tag{3.18}$$


(See Figure 3.3.) Again, the concept of a normal to a hypersurface
should be independent of the coordinates used. Under the coordinate
transformation (3.15), the new components are

$$B'_i = \frac{\partial \phi}{\partial x'^i}.$$

It is easy to see that $B_i \to B'_i$ is a linear transformation:

$$B'_k = \frac{\partial x^i}{\partial x'^k} B_i. \tag{3.19}$$

Again, we generalize (3.19) as a transformation law of any vector
$B_i$. Quantities that transform according to this rule are called covariant
vectors.


Example 3.2.2 The normal to the unit sphere given by

$$\phi = (x^1)^2 + (x^2)^2 + (x^3)^2 = 1$$

has the covariant components

$$B_0 = 0, \quad B_1 = 2x^1, \quad B_2 = 2x^2, \quad B_3 = 2x^3.$$


Consider oblique coordinate axes $OX_1$ and $OX_2$ inclined at an acute
angle $\beta$ in a two-dimensional Euclidean plane. Figure 3.4 illustrates the
situation. Let a vector $\mathbf{A}$ be shown by the arrow $OP$. How do we specify
the two components of this vector? There are two obvious ways. One
is to draw straight lines through $P$ parallel to the axes, intersecting them
at $R_1$ and $R_2$, respectively. The lengths $OR_1$ and $OR_2$ then specify
the contravariant components of the vector, for these components are
in the directions tangential to the coordinate lines $x_2$ = constant and
$x_1$ = constant. The second way is to drop perpendiculars $PS_1$ and $PS_2$
from $P$ on to the two coordinate axes. The lengths $OS_1$ and $OS_2$ then
represent the covariant components of the vector, since they are given by
the intercepts of normals to the coordinate axes. As will be appreciated,
the two ways of describing the vector coincide if we choose rectangular
Cartesian coordinates.

Fig. 3.3. The normal to the hypersurface at Q acts like a covariant vector.

Fig. 3.4. If the axes are not rectangular, even in Euclid's geometry the
covariant and contravariant components of a vector are different, as seen
here. (See the text for details.)

We will return to this example later. 


3.3 Tensors 

The concept of a vector can be generalized to that of a tensor. Imagine
a product of two contravariant vectors $A^i$ and $B^k$. The $4 \times 4$ quantities
$A^i B^k$ describe a tensor. Since we know from (3.16) how $A^i$ and $B^k$
transform, we can work out how their product transforms, and apply the rule
to a general tensor with two contravariant indices. Thus a contravariant
tensor of rank 2 is characterized by the following transformation law:

$$T'^{ik} = \frac{\partial x'^i}{\partial x^m} \frac{\partial x'^k}{\partial x^n} T^{mn}. \tag{3.20}$$

A covariant tensor of rank 2 is likewise characterized by the
transformation law

$$T'_{ik} = \frac{\partial x^m}{\partial x'^i} \frac{\partial x^n}{\partial x'^k} T_{mn}. \tag{3.21}$$
It is also possible to have mixed tensors. Thus $T^i{}_k$ is a mixed tensor
of rank 2, with one contravariant index and one covariant index. It
transforms as

$$T'^i{}_k = \frac{\partial x'^i}{\partial x^m} \frac{\partial x^n}{\partial x'^k} T^m{}_n. \tag{3.22}$$


Again, these concepts are easily generalized to tensors of rank higher
than 2. The rule is to introduce a transformation factor $\partial x'^i/\partial x^m$ for each
contravariant index $i$ and a factor $\partial x^n/\partial x'^k$ for each covariant index $k$.
In general, a mixed tensor of rank $p + q$ may have $p$ contravariant
indices and $q$ covariant indices.

Trivially, we may consider a scalar as a tensor of rank 0 and a vector 
as a tensor of rank 1. 
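
The rule stated above for tensors of higher rank can be written out in full for a mixed tensor with $p$ contravariant and $q$ covariant indices. The following is a sketch in the notation of (3.20)-(3.22); the dummy-index names $m_1, \ldots, m_p$ and $n_1, \ldots, n_q$ are introduced here only for illustration.

```latex
T'^{\,i_1 \cdots i_p}{}_{k_1 \cdots k_q}
  = \frac{\partial x'^{i_1}}{\partial x^{m_1}} \cdots
    \frac{\partial x'^{i_p}}{\partial x^{m_p}}\,
    \frac{\partial x^{n_1}}{\partial x'^{k_1}} \cdots
    \frac{\partial x^{n_q}}{\partial x'^{k_q}}\,
    T^{m_1 \cdots m_p}{}_{n_1 \cdots n_q}
```

Setting $p = 2$, $q = 0$ recovers (3.20); $p = 0$, $q = 2$ gives (3.21); and $p = q = 1$ gives (3.22). With $p = q = 0$ no factor appears at all, which is the statement that a scalar is unchanged.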


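Example 3.3.1 The quantities $g_{ik}$ transform as a covariant tensor. This
result follows from the assumption that $ds^2$ as given by (3.6) is invariant.
For

$$ds^2 = g_{ik}\, dx^i\, dx^k
      = g_{ik} \left( \frac{\partial x^i}{\partial x'^m} \frac{\partial x^k}{\partial x'^n} \right) dx'^m\, dx'^n
      = g'_{mn}\, dx'^m\, dx'^n;$$

that is,

$$g'_{mn} = \frac{\partial x^i}{\partial x'^m} \frac{\partial x^k}{\partial x'^n}\, g_{ik}.$$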

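Example. The Kronecker delta defined by

$$\delta^i_k = 1 \ \text{if}\ i = k, \quad \text{otherwise}\ \delta^i_k = 0$$

is a mixed tensor of rank 2. This can be easily proved using (3.22) and the
identity

$$\frac{\partial x'^i}{\partial x^m} \frac{\partial x^m}{\partial x'^k} = \delta^i_k.$$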

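Example. Define $\|g^{ik}\|$ to be the inverse matrix of $\|g_{ik}\|$, assuming that $g$,
the determinant of $\|g_{ik}\|$, is non-zero. Thus we have

$$g_{ik}\, g^{kl} = \delta^l_i.$$

We now show that $g^{ik}$ transforms as a contravariant tensor of rank 2.
We use the result just derived, namely that $g_{ik}$ transforms as a covariant
tensor, so that

$$g'_{ik} = \frac{\partial x^m}{\partial x'^i} \frac{\partial x^n}{\partial x'^k}\, g_{mn}.$$

Now define

$$B'^{kl} = \frac{\partial x'^k}{\partial x^p} \frac{\partial x'^l}{\partial x^q}\, g^{pq}$$

and consider the product

$$g'_{ik} B'^{kl}
  = \frac{\partial x^m}{\partial x'^i} \frac{\partial x^n}{\partial x'^k} \frac{\partial x'^k}{\partial x^p} \frac{\partial x'^l}{\partial x^q}\, g_{mn}\, g^{pq}
  = g_{mn}\, g^{pq}\, \delta^n_p\, \frac{\partial x^m}{\partial x'^i} \frac{\partial x'^l}{\partial x^q}
  = g_{mp}\, g^{pq}\, \frac{\partial x^m}{\partial x'^i} \frac{\partial x'^l}{\partial x^q}
  = \delta^q_m\, \frac{\partial x^m}{\partial x'^i} \frac{\partial x'^l}{\partial x^q}
  = \delta^l_i.$$

In other words $B'^{kl}$ is the inverse of the matrix $g'_{ik}$. This proves the result.

Fig. 3.5. An example of a second-rank tensor. In the example illustrated, the
stress tensor at P relates $F_\nu$, the stress force at P, to the direction of the
local normal to the surface at P. In general the two directions are not parallel.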

Example. A physical example for tensors is found in the three-dimensional
space, when discussing deformation of substances. Figure 3.5 illustrates the
surface $\Sigma$ of such a substance, which has normal $n_\mu$ at a typical point P. If
the surface is subjected to stress, the resulting force on an element of surface
around P will be $F_\nu$, different in direction from the normal $n_\mu$, but related to
it by the linear tensor relation

$$F_\nu = T_{\mu\nu}\, n_\mu,$$

where $T_{\mu\nu}$ is the stress tensor. If the stress is isotropic, then

$$T_{\mu\nu} = p\, \delta_{\mu\nu},$$

where $p$ is the pressure which produces a force normal to the
surface $\Sigma$. (Notice that we have not used the upper/lower indices since
we are discussing Cartesian tensors in three dimensions.)

Example. In dynamics we encounter the moment-of-inertia tensor $I_{\mu\nu}$ of a
massive extended body, which is a second-rank tensor in three-dimensional
space. If $\omega_\nu$ is the angular velocity of the body then its angular momentum
is given by the vector $I_{\mu\nu}\,\omega_\nu$.


3.3.1 Contraction 

The operation of contraction consists of identifying a lower index with
an upper index in a mixed tensor. This procedure reduces the rank of the
tensor by 2, since the repeated index implies a sum over all of its four
values.

Thus $A^i B_k$ is a tensor of rank 2 if $A^i$ and $B_k$ are vectors. The
identification $i = k$ gives a scalar,

$$A^i B_i = A^0 B_0 + A^1 B_1 + A^2 B_2 + A^3 B_3.$$

As in special relativity, we define a vector $A^i$ to be spacelike, timelike
or null according to

$$g_{ik} A^i A^k < 0, \qquad g_{ik} A^i A^k > 0, \qquad \text{or} \qquad g_{ik} A^i A^k = 0.$$

It is convenient to define associated tensors by the relations

$$A_i = g_{ik} A^k, \qquad A^k = g^{ik} A_i. \tag{3.23}$$

Thus $g_{ik} A^i A^k = A_k A^k$. The operations embodied in (3.23) are called
lowering and raising the indices. We may frequently refer to $A^i$ and $A_i$
as the same object.


Example 3.3.2 Let us go back to the example of oblique axes on a plane
given at the end of Section 3.2.3. We now use the coordinates as $x^1$ and $x^2$
to keep within our general convention. We have the line element given by

$$ds^2 = (dx^1)^2 + (dx^2)^2 + 2 \cos\beta\, dx^1\, dx^2.$$

(Although, being spacelike distances, all these terms should be negative, we
have omitted the negative signs everywhere to simplify the discussion, which
in any case is not affected by this change of sign.) From the above we have
the following components of the metric tensor and its inverse:

$$g_{11} = 1, \quad g_{12} = \cos\beta, \quad g_{22} = 1;$$

$$g^{11} = \operatorname{cosec}^2\beta = g^{22}, \quad g^{12} = -\operatorname{cosec}^2\beta\, \cos\beta.$$

From this it is easy to see that, if the contravariant components of a vector
are $A^1$ and $A^2$, respectively, then the covariant components are

$$A_1 = A^1 + A^2 \cos\beta, \qquad A_2 = A^2 + A^1 \cos\beta.$$

These are the same components as those we had derived from geometrical
considerations.
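
The agreement between the algebraic rule (3.23) and the geometrical construction can be checked numerically. The sketch below assumes unit vectors along the two oblique axes, embedded in ordinary Cartesian coordinates for the purpose of taking perpendicular projections; the function names and the sample values of $\beta$ and $A^i$ are illustrative assumptions.

```python
import math

def metric(beta):
    """Metric g_ik for oblique axes inclined at angle beta."""
    return [[1.0, math.cos(beta)], [math.cos(beta), 1.0]]

def inverse_metric(beta):
    """Inverse metric g^ik: cosec^2(beta) on the diagonal, -cosec^2(beta) cos(beta) off it."""
    s2 = math.sin(beta) ** 2
    return [[1.0 / s2, -math.cos(beta) / s2], [-math.cos(beta) / s2, 1.0 / s2]]

def contract(matrix, vector):
    """Sum over the repeated index: result_i = matrix_ik * vector_k."""
    return [sum(matrix[i][k] * vector[k] for k in range(2)) for i in range(2)]

def perpendicular_projections(A_up, beta):
    """Geometrical covariant components: lengths OS_1, OS_2 of the perpendicular feet."""
    e1 = (1.0, 0.0)                          # unit vector along the first axis
    e2 = (math.cos(beta), math.sin(beta))    # unit vector along the second axis
    # Cartesian position of P, built from the parallel (contravariant) components.
    px = A_up[0] * e1[0] + A_up[1] * e2[0]
    py = A_up[0] * e1[1] + A_up[1] * e2[1]
    return [px * e1[0] + py * e1[1], px * e2[0] + py * e2[1]]

beta = math.radians(60.0)
A_up = [2.0, 1.0]

A_down = contract(metric(beta), A_up)             # lowering the index
A_geom = perpendicular_projections(A_up, beta)    # geometrical construction
A_back = contract(inverse_metric(beta), A_down)   # raising the index again
print(A_down, A_geom, A_back)
```

Lowering the index reproduces the perpendicular projections, and raising it again returns the original contravariant components.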
Example 3.3.3 Problem. A curve is specified by the following coordinates
in terms of parameter $t$:

$$x^0 = \sqrt{3}\, ct, \quad x^1 = ct, \quad x^2 = c t_0 \cos\left(\frac{t}{t_0}\right), \quad x^3 = c t_0 \sin\left(\frac{t}{t_0}\right).$$

Determine whether the tangent vector at a typical point is spacelike, timelike
or null. The spacetime is Minkowskian.

Solution. The tangent vector has components

$$A^0 = \sqrt{3}\, c, \quad A^1 = c, \quad A^2 = -c \sin\frac{t}{t_0}, \quad A^3 = c \cos\frac{t}{t_0}.$$

Therefore

$$A^k A_k = (A^0)^2 - (A^1)^2 - (A^2)^2 - (A^3)^2
         = 3c^2 - c^2 - c^2 \sin^2\frac{t}{t_0} - c^2 \cos^2\frac{t}{t_0}
         = c^2 > 0.$$

So the tangent vector is timelike.
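
The same classification can be carried out numerically at several points of the curve. This is a minimal sketch, assuming the Minkowski signature (+, -, -, -) used in the solution above; the units with $c = 1$, the value of $t_0$ and the function names are illustrative choices.

```python
import math

def tangent(t, c, t0):
    """Tangent vector dx^i/dt of the curve in Example 3.3.3."""
    return [math.sqrt(3.0) * c, c, -c * math.sin(t / t0), c * math.cos(t / t0)]

def minkowski_norm(A):
    """A^k A_k with signature (+, -, -, -)."""
    return A[0] ** 2 - A[1] ** 2 - A[2] ** 2 - A[3] ** 2

def classify(A, tol=1e-12):
    """Return 'timelike', 'spacelike' or 'null' according to the sign of A^k A_k."""
    s = minkowski_norm(A)
    scale = sum(a * a for a in A)   # sets the tolerance for treating s as zero
    if abs(s) <= tol * scale:
        return "null"
    return "timelike" if s > 0 else "spacelike"

c, t0 = 1.0, 2.0   # illustrative units and parameter value
sample_times = (0.0, 0.5, 3.0)
norms = [minkowski_norm(tangent(t, c, t0)) for t in sample_times]
kinds = [classify(tangent(t, c, t0)) for t in sample_times]
print(norms, kinds)
```

At every sampled point the norm comes out equal to $c^2$ up to rounding, so the tangent is timelike everywhere along the curve.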


3.3.2 The quotient law 

From the above manipulations of tensors it is clear (and can easily be 
proved) that the product of two tensors is a tensor. A reverse result is 
sometimes useful in deducing that a certain quantity is a tensor. This 
result is known as the quotient law. It states that, if a relation such as 

$$PQ = R \tag{3.24}$$

holds in all coordinate frames, where P is an arbitrary tensor of rank m 
and R a tensor of rank m + n , then Q is a tensor of rank n. The reader 
may try to formulate a proof of this statement. 

3.3.3 Symmetric and antisymmetric tensors 

If tensors $S_{ik}$ and $A_{ik}$ satisfy the relations

$$S_{ik} = S_{ki}, \qquad A_{ik} = -A_{ki}, \tag{3.25}$$

then they are respectively symmetric and antisymmetric tensors of rank
2. These ideas can be generalized to higher-rank tensors, and we will
encounter specific tensors having the properties of symmetry and
antisymmetry with respect to some or all indices.
Example 3.3.4 $g_{ik}$ and $g^{ik}$ are symmetric tensors.

Example. Consider the symbol $\epsilon_{ijkl}$ with the following properties:

$\epsilon_{ijkl} = +1$ if $(ijkl)$ is an even permutation of (0123),
$\epsilon_{ijkl} = -1$ if $(ijkl)$ is an odd permutation of (0123),
$\epsilon_{ijkl} = 0$ otherwise.

We will now show that

$$e_{ijkl} = \sqrt{-g}\; \epsilon_{ijkl}$$

transforms as a tensor.

First take the determinant of the transformation law of $g_{mn}$ given in
Example 3.3.1. Let $J$ denote the Jacobian $|\partial x^i/\partial x'^m|$. Then, using the rule
that the determinant of a product of matrices is equal to the product of their
determinants, we get

$$g' = J^2 g.$$

However, we have from the algebraic definition of a determinant

$$\epsilon_{mnpq}\, J = \epsilon_{ijkl}\, \frac{\partial x^i}{\partial x'^m} \frac{\partial x^j}{\partial x'^n} \frac{\partial x^k}{\partial x'^p} \frac{\partial x^l}{\partial x'^q}.$$

Just write out the full expansion of $J$ as a sum of products of its elements and
this result will be clear! Using the above two relations, the result follows:
$e_{ijkl}$ is a tensor that is totally antisymmetric. Strictly speaking, however, $e_{ijkl}$
is a pseudotensor, since it changes sign under transformations involving
reflection, such as $x'^0 = -x^0$, $x'^1 = x^1$, $x'^2 = x^2$ and $x'^3 = x^3$.


3.3.4 Totally symmetric and antisymmetric tensors 

Consider a tensor $T_{i_1 i_2 \cdots i_n}$ of rank $n$. From this we construct a tensor

$$S_{i_1 \cdots i_n} = \frac{1}{n!} \sum_P T_{P i_1\, P i_2 \cdots P i_n}, \tag{3.26}$$

where the sum is over all permutations $P$ of $(1, 2, \ldots, n)$. Evidently, if
we permute the indices of $S$ in any way, its value does not change. Such
a tensor is called a totally symmetric tensor.

Likewise, if we write $(-1)^P = +1$ for an even permutation and
$(-1)^P = -1$ for an odd permutation, then the sum

$$A_{i_1 \cdots i_n} = \frac{1}{n!} \sum_P (-1)^P\, T_{P i_1 \cdots P i_n} \tag{3.27}$$

gives us a totally antisymmetric tensor. An odd permutation of
$(i_1, \ldots, i_n)$ changes the sign of $A_{i_1 \cdots i_n}$, whereas an even permutation
does not.

If $A_{ik}$ is any tensor we symmetrize it by writing

$$A_{(ik)} = \tfrac{1}{2}(A_{ik} + A_{ki}). \tag{3.28}$$

Similarly, we antisymmetrize it by writing

$$A_{[ik]} = \tfrac{1}{2}(A_{ik} - A_{ki}). \tag{3.29}$$

We can easily extend these concepts to tensors of higher rank as indicated
in (3.27) and (3.26). Note the convention of writing $(ik)$ for symmetrizing
with respect to indices $(i, k)$ and $[ik]$ for antisymmetrizing.


Example 3.3.5 Problem. Using $g_{ik}$ twice, construct totally symmetric and
totally antisymmetric tensors of rank 4.

Solution. Write $T_{iklm} = g_{ik}\, g_{lm}$.

Even permutations of $iklm$ are

iklm, ilmk, imkl, lkmi, likm, lmik, kmli, klim, kiml, mkil, mlki, milk.

Odd permutations of $iklm$ are likewise

kilm, limk, mikl, klmi, ilkm, mlik, mkli, lkim, ikml, kmil, lmki, imlk.

Thus the totally symmetric tensor is

$$\begin{aligned}
S_{iklm} = {}& \frac{1}{24} \big[ g_{ik}g_{lm} + g_{il}g_{mk} + g_{im}g_{kl} + g_{lk}g_{mi} + g_{li}g_{km} + g_{lm}g_{ik} \\
 & \quad + g_{km}g_{li} + g_{kl}g_{im} + g_{ki}g_{ml} + g_{mk}g_{il} + g_{ml}g_{ki} + g_{mi}g_{lk} \big] \\
 & + \frac{1}{24} \big[ g_{ki}g_{lm} + g_{li}g_{mk} + g_{mi}g_{kl} + g_{kl}g_{mi} + g_{il}g_{km} + g_{ml}g_{ik} \\
 & \quad + g_{mk}g_{li} + g_{lk}g_{im} + g_{ik}g_{ml} + g_{km}g_{il} + g_{lm}g_{ki} + g_{im}g_{lk} \big].
\end{aligned}$$

Using the symmetry $g_{ik} = g_{ki}$, we can simplify $S_{iklm}$ to

$$S_{iklm} = \tfrac{1}{3} \left[ g_{ik}g_{lm} + g_{il}g_{mk} + g_{im}g_{lk} \right].$$

Insofar as the totally antisymmetric tensor is concerned, it will be the
difference of the two expressions in square brackets given above for $S_{iklm}$. This
difference is zero. So there is no totally antisymmetric tensor that can be
constructed this way!
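
The sums over permutations in (3.26) and (3.27) lend themselves to a direct numerical check of this example. The sketch below is written under stated assumptions: the array `g` is an arbitrary symmetric stand-in for $g_{ik}$, a tensor is represented as a function of an index tuple, and the names `permutation_sign` and `project` are illustrative.

```python
from itertools import permutations, product
from math import factorial

def permutation_sign(perm):
    """Return +1 for an even permutation of 0..n-1 and -1 for an odd one."""
    perm = list(perm)
    sign = 1
    for i in range(len(perm)):
        # Sort the entry at position i into place by transpositions.
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            sign = -sign
    return sign

def project(T, idx, antisymmetric=False):
    """One component of the totally symmetric (3.26) or antisymmetric (3.27) part of T."""
    total = 0.0
    for perm in permutations(range(len(idx))):
        weight = permutation_sign(perm) if antisymmetric else 1
        total += weight * T(tuple(idx[p] for p in perm))
    return total / factorial(len(idx))

# Hypothetical symmetric 4 x 4 array standing in for g_ik.
g = [[1.0 / (1 + i + k) for k in range(4)] for i in range(4)]

def T(idx):
    """T_iklm = g_ik g_lm."""
    i, k, l, m = idx
    return g[i][k] * g[l][m]

def simplified_S(idx):
    """The simplified form (1/3)(g_ik g_lm + g_il g_mk + g_im g_lk)."""
    i, k, l, m = idx
    return (g[i][k] * g[l][m] + g[i][l] * g[m][k] + g[i][m] * g[l][k]) / 3.0

all_indices = list(product(range(4), repeat=4))
max_sym_error = max(abs(project(T, idx) - simplified_S(idx)) for idx in all_indices)
max_antisym = max(abs(project(T, idx, antisymmetric=True)) for idx in all_indices)
print(max_sym_error, max_antisym)
```

Both printed numbers are zero to rounding error: the 24-term average collapses to the three-term expression, and the antisymmetrized combination vanishes identically.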

Problem. If $F_{ik}$ is an antisymmetric tensor then show that

$$T_{ikl} = \frac{\partial F_{ik}}{\partial x^l} + \frac{\partial F_{kl}}{\partial x^i} + \frac{\partial F_{li}}{\partial x^k}$$

is a third-rank tensor.
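Solution. Consider the tensor transformation law for $F_{ik}$ below:

$$F'_{mn} = \frac{\partial x^i}{\partial x'^m} \frac{\partial x^k}{\partial x'^n}\, F_{ik}.$$

Differentiate with respect to $x^l$, and use the relation

$$\frac{\partial}{\partial x^l} = \frac{\partial x'^p}{\partial x^l} \frac{\partial}{\partial x'^p}.$$

Thus we get

$$\frac{\partial x'^p}{\partial x^l} \frac{\partial F'_{mn}}{\partial x'^p}
  = \frac{\partial x^i}{\partial x'^m} \frac{\partial x^k}{\partial x'^n} \frac{\partial F_{ik}}{\partial x^l}
  + \frac{\partial x'^p}{\partial x^l} \frac{\partial^2 x^i}{\partial x'^p\, \partial x'^m} \frac{\partial x^k}{\partial x'^n}\, F_{ik}
  + \frac{\partial x^i}{\partial x'^m} \frac{\partial^2 x^k}{\partial x'^p\, \partial x'^n} \frac{\partial x'^p}{\partial x^l}\, F_{ik}.$$

Multiply both sides by $\partial x^l/\partial x'^q$ and use the result
$(\partial x'^p/\partial x^l)(\partial x^l/\partial x'^q) = \delta^p_q$. Then we get

$$\frac{\partial F'_{mn}}{\partial x'^q}
  = \frac{\partial x^l}{\partial x'^q} \frac{\partial x^i}{\partial x'^m} \frac{\partial x^k}{\partial x'^n} \frac{\partial F_{ik}}{\partial x^l}
  + \frac{\partial^2 x^i}{\partial x'^q\, \partial x'^m} \frac{\partial x^k}{\partial x'^n}\, F_{ik}
  + \frac{\partial^2 x^k}{\partial x'^q\, \partial x'^n} \frac{\partial x^i}{\partial x'^m}\, F_{ik}.$$

We will have similar expressions from the second and third expressions on
the right-hand side of the relation for $T_{ikl}$:

$$\frac{\partial F'_{nq}}{\partial x'^m}
  = \frac{\partial x^i}{\partial x'^m} \frac{\partial x^k}{\partial x'^n} \frac{\partial x^l}{\partial x'^q} \frac{\partial F_{kl}}{\partial x^i}
  + \frac{\partial^2 x^k}{\partial x'^m\, \partial x'^n} \frac{\partial x^l}{\partial x'^q}\, F_{kl}
  + \frac{\partial^2 x^l}{\partial x'^m\, \partial x'^q} \frac{\partial x^k}{\partial x'^n}\, F_{kl},$$

$$\frac{\partial F'_{qm}}{\partial x'^n}
  = \frac{\partial x^k}{\partial x'^n} \frac{\partial x^l}{\partial x'^q} \frac{\partial x^i}{\partial x'^m} \frac{\partial F_{li}}{\partial x^k}
  + \frac{\partial^2 x^l}{\partial x'^n\, \partial x'^q} \frac{\partial x^i}{\partial x'^m}\, F_{li}
  + \frac{\partial^2 x^i}{\partial x'^n\, \partial x'^m} \frac{\partial x^l}{\partial x'^q}\, F_{li}.$$

When all three equations are added the antisymmetry of $F_{ik}$ ensures that the
terms involving second derivatives cancel out in pairs and the result follows.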


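Problem. Prove that

$$\epsilon^{ijkl}\, \epsilon_{ijrs} = 2(\delta^k_r \delta^l_s - \delta^k_s \delta^l_r).$$

Solution. To prove this result we may resort to the properties of determinants.
It is easy to verify that

$$\epsilon^{ijkl}\, \epsilon_{pqrs} =
\begin{vmatrix}
\delta^i_p & \delta^i_q & \delta^i_r & \delta^i_s \\
\delta^j_p & \delta^j_q & \delta^j_r & \delta^j_s \\
\delta^k_p & \delta^k_q & \delta^k_r & \delta^k_s \\
\delta^l_p & \delta^l_q & \delta^l_r & \delta^l_s
\end{vmatrix}.$$

Putting $i = p$, $j = q$ gives

$$\epsilon^{ijkl}\, \epsilon_{ijrs} =
\begin{vmatrix}
4 & \delta^i_j & \delta^i_r & \delta^i_s \\
\delta^j_i & 4 & \delta^j_r & \delta^j_s \\
\delta^k_i & \delta^k_j & \delta^k_r & \delta^k_s \\
\delta^l_i & \delta^l_j & \delta^l_r & \delta^l_s
\end{vmatrix}.$$

The result follows on evaluating the determinant. (It is simpler to use the
expansion formula in terms of products of 2 x 2 determinants in the top two
rows and the bottom two rows.)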

3.4 Concluding remarks 

We end this chapter with a note that the vectors and tensors described 
here are definable in any coordinate frame. Thus we are not restricted to 
inertial frames or to linear transformations between such frames. Clearly 
this machinery will be useful to us in general relativity, where we aim 
to describe physics and dynamics in any general reference frame. 

However, we need to proceed further along this track before addressing
those issues. We have to progress from the tensor algebra described
above to tensor calculus, since equations of physics and dynamics 
require us to differentiate quantities with respect to space and time 
coordinates. Having done that, we also have to describe the essential 
features of a non-Euclidean spacetime. We will therefore first take up 
the question of how to describe a physically meaningful derivative of a 
tensor. 


Exercises 

1. Which of the following expressions are invalid with respect to the summation 

convention: (a) . ; . 1(b) g ik g ik , (c) R ik g ik , (d) e lklm e illm and (e) T ik g k ? 

Simplify those expressions that are valid. 

2. $A_{ik}$ is a tensor such that $\|A_{ik}\|$ is non-singular. Show that the components of
the inverse matrix transform as a tensor. (An example of this result is the tensor
$g^{ik}$.)

3. If $A_l$ is a vector, show that $F_{lm} = A_{m,l} - A_{l,m}$ is a second-rank tensor.

4. Show that $\epsilon^{ijkl}\, \epsilon_{ijkr} = 6\delta^l_r$.

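5. If $\xi^i$ is a vector field, deduce from first principles that

$$\xi^i \frac{\partial g_{mn}}{\partial x^i} + g_{mi} \frac{\partial \xi^i}{\partial x^n} + g_{in} \frac{\partial \xi^i}{\partial x^m}$$

is a tensor field.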





6. Find a coordinate transformation of the form

$$R = R(r, t), \qquad T = T(r, t),$$

which will transform the line element

$$ds^2 = dt^2 - S^2(t) \left[ \frac{dr^2}{1 - kr^2} + r^2 (d\theta^2 + \sin^2\theta\, d\phi^2) \right],$$

where $S(t)$ is a function of $t$ and $k$ is a constant, to the form

$$ds^2 = e^{\nu} dT^2 - e^{\lambda} dR^2 - R^2 (d\theta^2 + \sin^2\theta\, d\phi^2).$$

Deduce that 

7. A surface of revolution is generated by rotating the parabola

$$y^2 = 4ax$$

about the x-axis. By writing $x = at^2$, $y = 2at$ or otherwise, show that the line
element on this surface is given by

$$ds^2 = 4a^2 \left[ t^2\, d\phi^2 + (1 + t^2)\, dt^2 \right].$$

8. In the Kerr spacetime, the line element is given by

$$\begin{aligned}
ds^2 = {}& \left( \frac{R^2 - 2GMR + h^2}{R^2 + h^2 \cos^2\theta} \right) (dT - h \sin^2\theta\, d\phi)^2
 - \frac{\sin^2\theta}{R^2 + h^2 \cos^2\theta} \left[ (R^2 + h^2)\, d\phi - h\, dT \right]^2 \\
 & - \left( \frac{R^2 + h^2 \cos^2\theta}{R^2 - 2GMR + h^2} \right) dR^2 - (R^2 + h^2 \cos^2\theta)\, d\theta^2.
\end{aligned}$$

An observer in this spacetime has constant $R$, $\theta$, $\phi$ coordinates. Show that the
world line of this observer has a timelike tangent, provided that

$$R > GM + (G^2 M^2 - h^2 \cos^2\theta)^{1/2}.$$

9. Given that $a_{ik}$ and $b_{ik}$ are two symmetric tensors satisfying the relation

$$a_{ij} b_{kl} - a_{il} b_{jk} + a_{jk} b_{il} - a_{kl} b_{ij} = 0,$$

show that $a_{ij} = \rho\, b_{ij}$, where $\rho$ is a scalar.

10. From an antisymmetric tensor $F_{ik}$ is constructed its dual

$$F^{*lm} = \tfrac{1}{2}\, \epsilon^{iklm} F_{ik}.$$

Show that the dual of $F^{*lm}$ is $F_{ik}$. Is this result valid for all types of tensors?

11. Show that the tensor $\theta_{ik} = g_{ik} - U_i U_k$ when multiplied by any vector $V^k$
projects it into a 3-surface orthogonal to the unit timelike vector $U_i$.

12. Define $V^*_{jkl} = \epsilon_{ijkl} V^i$. Then show that $V_i V^i = \pm V^*_{jkl} V^{*jkl}$.

13. Show that the proper volume element $dV = \sqrt{-g}\; dx^0\, dx^1\, dx^2\, dx^3$
transforms as a scalar under general coordinate transformations.

14. In Minkowski spacetime two focal points A and B are identified with Cartesian
coordinates $x = a$, $y = 0$, $z = 0$ and $x = -a$, $y = 0$, $z = 0$, $t$ being the
time coordinate. Let P be any point in space at $t$ = constant and let $r_1$ and $r_2$
be the distances of P from A and B. Let $\xi = \tfrac{1}{2}(r_1 + r_2)$ and $\eta = \tfrac{1}{2}(r_1 - r_2)$.
Suppose the azimuthal coordinate for the plane PAB around the axis AB is $\phi$.
Then the Minkowski line element in these coordinates is

$$ds^2 = c^2 dt^2 - (\xi^2 - \eta^2) \left( \frac{d\xi^2}{\xi^2 - a^2} + \frac{d\eta^2}{a^2 - \eta^2} \right)
       - \frac{(\xi^2 - a^2)(a^2 - \eta^2)}{a^2}\, d\phi^2.$$

What surfaces do $\xi$ = constant and $\eta$ = constant represent?


15. Show by a coordinate transformation that the spacetime metric

$$ds^2 = c^2 dt^2 - c^2 t^2 \left[ \frac{dr^2}{1 + r^2} + r^2 (d\theta^2 + \sin^2\theta\, d\phi^2) \right]$$

represents the Minkowski spacetime.

16. Show that $\epsilon^{iklm}\, \epsilon_{lmpq}\, \epsilon^{pq}{}_{ik} = 0$, but $\epsilon^{iklm}\, \epsilon_{lmpq}\, \epsilon^{pqrs}\, \epsilon_{rsik} = 96$. Can you
formulate a general rule for the closed cycle of $n$ such symbols multiplied together,
where $n$ can be odd or even?



Chapter 4 

Covariant differentiation 


4.1 The concept of general covariance 

We begin this chapter by introducing the idea of a field in physics (to 
be distinguished from the ‘field’ in algebra that mathematicians talk 
about). The idea was popularized by Michael Faraday in the context of 
the electric and magnetic fields. Figure 4.1 shows what happens when 
iron filings are sprinkled in the vicinity of a bar magnet. The filings 
get distributed in a pattern somewhat like that in this figure. Faraday 
called these curves lines of force. If we imagine a magnetic pole placed 
anywhere on one of these lines, it will move along that line, being guided 
by the magnetic force on it. The lines of force therefore represent the 
‘magnetic field’ B both in strength and direction at any point in the 
vicinity of the magnet. In short the magnet generates a ‘field’ of B- 
vectors all around it, representing the force exerted by it on another 
magnetic pole. 1 

We generalize the concept of a vector field by defining a vector 
function of spacetime variables, so that at each point a vector is defined. 
This idea may further be generalized by having tensor fields as functions 
of spacetime coordinates. Thus we could argue that equations of physics 
involve fields related by partial differential equations. If we additionally 
require that these equations do not change their form under changes of 
coordinates (thereby being the same for all observers), then they should 
be represented by tensor fields. Thus general relativity assumes as a
basic postulate that fundamental physics is described by such fields.
This premise is stated in the following form: the laws of physics are
generally covariant. This postulate greatly restricts the form of a physics
equation.

1 In reality there is no magnetic pole existing in isolation. But the concept of a line of
force can nevertheless be explained this way.

Fig. 4.1. Lines of force of a magnet obtained by sprinkling iron filings. The
tangent to a line of force at any point indicates the direction of the magnetic field
B. The field strength is high where such lines appear crowded, as, for example,
near the poles of the magnet.


Example 4.1.1 Consider the differential equation 

d 2 <t> _ d 2 <p d 2 cp dtp dip 

3 1 2 dx 2 dy 2 dx dz 

This equation is not generally covariant; that is, it does not preserve the above
form under a change of coordinates. On the other hand, the wave equation
$\Box\phi = 0$ preserves its form, which is why it is found in various branches of
physics.


The mathematical implication of general covariance for spacetime
geometry can be likewise understood. If we are to deal with non-Euclidean
geometry, we will at some stage encounter the concept of
curvature of spacetime. This is an example of the intrinsic properties
of spacetime that require description independently of the choice of
coordinates. Such concepts are best described in terms of vectors and
tensors.

Thus, having established the need for tensors in formulating spacetime
structure as well as the physics in it, we have to ensure that any
differential equations involving them should also be generally covariant,
i.e., not depending on any specific choice of coordinates. In short, they
also should be expressible as vectors and tensors. We will soon find that
this is a non-trivial requirement, for the partial derivative of a tensor
need not be a tensor.


4.2 Parallel transport 

We begin the discussion of vector derivatives with the example of a
vector field. Let $B_i(x^k)$ be a covariant vector field whose four components
transform according to the rule in (3.19) at each point $(x^k)$ where it is
defined. Suppose $B_i$ is a differentiable function of $(x^k)$. Do the partial
derivatives $\partial B_i/\partial x^k$ transform as a tensor?

We have already seen that the derivatives $\partial\phi/\partial x^k$ of a scalar
transform as a vector. So at first sight the answer to the above question might
be 'yes'. Indeed, in special relativity we do encounter such results. For
example, if $A_i$ is the 4-potential of the electromagnetic field (described
in the four-dimensional language of special relativity), then $\partial A_i/\partial x^k$,
for Cartesian coordinates $(x, y, z)$ and the time $t$ of (3.1), do transform
as a tensor. In our more general spacetime with an arbitrary coordinate
system, however, the answer to the above question is in the negative.

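This result is easily verified by differentiating (3.19) with respect to
$x'^m$. We get (by writing $\partial/\partial x'^m$ as $(\partial x^n/\partial x'^m)\, \partial/\partial x^n$)

$$\frac{\partial B'_k}{\partial x'^m} = \frac{\partial x^i}{\partial x'^k} \frac{\partial x^n}{\partial x'^m} \frac{\partial B_i}{\partial x^n}
  + \frac{\partial^2 x^i}{\partial x'^m\, \partial x'^k}\, B_i. \tag{4.1}$$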

Thus, whereas the first term on the right-hand side does appear in
the right form to make $\partial B_i/\partial x^n$ a tensor, the second term spoils the
effect. It also gives a clue as to why this happens. The second derivative

$$\frac{\partial^2 x^i}{\partial x'^m\, \partial x'^k}$$

is expected to be non-zero because, in general, the transformation
coefficients in Equation (3.19) vary with position in spacetime. When we
seek to construct the derivative $\partial B_i/\partial x^n$, we have to define it as a limit:

$$\frac{\partial B_i}{\partial x^n} = \lim_{\delta x^n \to 0} \frac{B_i(x^k + \delta x^k) - B_i(x^k)}{\delta x^n}.$$

However, the two terms in the numerator transform as vectors at two
different points, and because of the variation of the transformation
coefficients with position their difference is not expected to be a vector. (The
difference of two vectors is a vector provided that both are so defined at
the same point.)

Fig. 4.2. The parallel transport of the vector $B_i$ at P to Q shows, in general,
different components $B_i + \delta B_i$ at Q (see the dotted vector). In the case
of a vector field, the components specified at Q, namely $B_i + dB_i$, would be
different from those (at Q) obtained by parallel transport.

This situation is illustrated in Figure 4.2. P and Q are the two neighbouring
points $(x^k)$ and $(x^k + \delta x^k)$, with the vectors $B_i$ shown there with
continuous arrows. In order to describe the change in the vector from P
to Q, we must somehow measure the difference at the same point. How
can this be achieved?

This is achieved by a device known as parallel transport. Assume
that the vector $B_i$ at P is moved from P to Q, parallel to itself, that is, as
if its magnitude and direction did not change. In Figure 4.2 this is shown
by a dotted vector at Q. The difference between the vector $B_i(x^k + \delta x^k)$
and this dotted vector is a vector at Q and this tells us the real physical
difference in the vector from P to Q. So we may after all be able to
define a process of differentiation of vectors, provided that we know
what happens to $B_i$ during a parallel transport from P to Q.

First we have to note that the dotted vector at Q need not have the
same components as the undotted vector at P. It is only with Cartesian
coordinates that the components are the same.


Example 4.2.1 Consider the Euclidean plane with a polar coordinate system.
A vector $\mathbf{A}$ at a point P with coordinates $(r, \theta)$ has components $A_r$
and $A_\theta$ in the radial and transverse directions. If we now move the vector
parallel to itself from P to a neighbouring point Q with polar coordinates
$(r + \delta r, \theta + \delta\theta)$, as shown in Figure 4.3, the radial and transverse directions
at Q will not be parallel to those at P. Hence after parallel transport of $\mathbf{A}$ from
P to Q its radial and transverse components at Q will be different from $A_r$ and
$A_\theta$.

A simple calculation using the geometry of infinitesimal rotation shows
that the components of $\mathbf{A}$ at Q are $A_r + \delta\theta\, A_\theta$ and $A_\theta - \delta\theta\, A_r$.
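
The 'simple calculation' can be illustrated numerically by holding the Cartesian components of the vector fixed (which is what moving it parallel to itself means in the Euclidean plane) and resolving it along the local radial and transverse directions at P and at Q. The sketch assumes that $A_r$ and $A_\theta$ denote components along unit vectors in those directions; the function name and the sample numbers are illustrative.

```python
import math

def polar_components(ax, ay, theta):
    """Resolve a Cartesian vector along the unit radial and transverse directions at angle theta."""
    a_r = ax * math.cos(theta) + ay * math.sin(theta)
    a_theta = -ax * math.sin(theta) + ay * math.cos(theta)
    return a_r, a_theta

ax, ay = 1.5, -0.8            # fixed Cartesian components: the vector itself does not change
theta, dtheta = 0.6, 1e-4     # polar angle of P and the small change in angle on going to Q

a_r, a_theta = polar_components(ax, ay, theta)                # components at P
a_r_q, a_theta_q = polar_components(ax, ay, theta + dtheta)   # exact components at Q

# First-order prediction quoted in the example.
approx_r = a_r + dtheta * a_theta
approx_theta = a_theta - dtheta * a_r
print(a_r_q - approx_r, a_theta_q - approx_theta)
```

The printed differences are of second order in the small angle, which confirms the first-order formulae. Note that the change in $r$ plays no role here, since the local directions depend only on $\theta$.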


Taking a cue from this example for our general case, we see that
the changes in the components of $B_i$ through parallel transport will be
proportional to the original components $B_i$, and also to the displacement
$\delta x^k$ in position from P to Q. We may express the change as a linear
function of both these quantities and the most general form that we can
have for it is

$$\delta B_i = \Gamma^l_{ik}\, B_l\, \delta x^k, \tag{4.2}$$

where the coefficients $\Gamma^l_{ik}$ are, in general, functions of space and time.
These quantities are called the three-index symbols or the Christoffel
symbols.



Fig. 4.3. The vector A at P is parallel-transported to Q; but
its components in the polar coordinates $(r, \theta)$ are different
at P and Q. This is because (unlike the Cartesian
coordinates) the local directions of $r$ = constant and
$\theta$ = constant change on going from P to Q.


Notice that the introduction of (4.2) involves something new in addition
to the introduction of the metric. The metric tells us how to measure
distance between neighbouring points, whereas $\Gamma^i_{kl}$ in (4.2) tells us how
to define parallel vectors at neighbouring points. This property of connecting
neighbouring vectors through the concept of local parallelism is
often called the affine connection of spacetime.

The reader may be worried at this stage as to how parallelism can 
be assumed when, as we saw in Chapter 2, the concept of parallel lines 
in non-Euclidean geometries is non-trivial. So we clarify that we are 
talking here of local parallelism, i.e., of parallelism over infinitesimal 
distances. Indeed, as we will elaborate later in Chapter 5, the ‘local’ 
region in the neighbourhood of any observer can be approximated by 
‘flat’ space or spacetime, where the concepts of Euclid’s geometry hold. 

There is a practical way of describing parallel propagation in the 
following fashion. 

We take the example of a sphere. Suppose $\Gamma$ is a curve drawn on
the spherical surface connecting points $P_1$ and $P_2$. The arrow shown in
Figure 4.4 represents in magnitude as well as direction a vector $\mathbf{A}_1$ at $P_1$.
How do we transport it parallel to itself to $P_2$ along $\Gamma$? Imagine a plane touching
the sphere at $P_1$, with the vector $\mathbf{A}_1$ mapped at the corresponding point $Q_1$
on the plane. (By mapping, we mean that the magnitude and direction of
the original vector on the sphere and the mapped vector on the tangent
plane should match.) Let the vector in the plane be called $\bar{\mathbf{A}}_1$. Now
carefully roll the sphere on the plane so that it keeps touching it along
the successive points of $\Gamma$. When you reach $P_2$, stop there. Let the
corresponding point on the plane be $Q_2$. Draw a vector $\bar{\mathbf{A}}_2$ at $Q_2$ parallel
to the starting vector $\bar{\mathbf{A}}_1$ on the plane at $Q_1$. Next map this vector
onto vector $\mathbf{A}_2$ at $P_2$ on the sphere. This will be the required
parallel-transported vector at $P_2$. This method can, in principle, be used for other
surfaces also.

Fig. 4.4. A practical way of finding the directions of a vector A
parallel-transported along a specified curve ($\Gamma$) from $P_1$ to $P_2$ on a sphere.
(See the text for details.)


4.3 The covariant derivative 

Returning to (4.2), we see that the difference between the continuous
and the dotted vectors at Q is given by

$$B_i(x^k + \delta x^k) - [B_i(x^k) + \delta B_i] = \left( \frac{\partial B_i}{\partial x^k} - \Gamma^l_{ik} B_l \right) \delta x^k. \tag{4.3}$$

We may accordingly redefine the physically meaningful derivative of a
vector by

$$B_{i;k} \equiv \frac{\partial B_i}{\partial x^k} - \Gamma^l_{ik} B_l = B_{i,k} - \Gamma^l_{ik} B_l. \tag{4.4}$$

This derivative, by definition, must transform as a tensor. It is called the
covariant derivative and will be denoted by a semicolon, as against the
ordinary derivative, which will be denoted by a comma.

If $B_{i;k}$ must transform as a tensor, the coefficients $\Gamma^i_{kl}$ have to
transform according to the following law:

$$\Gamma'^i_{kl} = \frac{\partial x'^i}{\partial x^m} \frac{\partial x^n}{\partial x'^k} \frac{\partial x^p}{\partial x'^l}\, \Gamma^m_{np}
  + \frac{\partial^2 x^p}{\partial x'^k\, \partial x'^l} \frac{\partial x'^i}{\partial x^p}. \tag{4.5}$$

This result can be verified after some straightforward but tedious
calculation.
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Example 4.3.1 Problem. T l kl and fjy are two affine connections defined in 
a spacetime with the same metric tensor. Show that Y‘ u — T' kl transforms as 
a tensor. 


Solution. Under a general transformation from x' to x" coordinates we have, 
from Equation (4.5), 


dx n dx" dxP d 2 xP dx n 

-T m 4-• - 

dx'" dx* dx' 1 np dx* dx n dx p 


and also 


dx' 1 dx" dxP ~ d 2 xP dx 

p// _ pm _i_ 

kl ~ dtp" dx* dx* "P dx' k dx' 1 ' dxP' 


On taking the difference of these two relations, we get 

dx" dx" dx p ~ 

kl kt> dx m dx* dx" np np 

This is the transformation law of a third-rank mixed tensor. Hence the result 
follows. 


A scalar, of course, does not change under parallel transport, which
is why $\partial\phi/\partial x^k$ transforms as a vector. If we use this result we see that, for
any arbitrary vector fields $A^i$ and $B_i$, $(A^i B_i)_{,k}$ is a vector. This property
enables us to construct the covariant derivative of a contravariant vector
$A^i$ as follows.

We have $(A^i B_i)_{;k} = A^i{}_{;k} B_i + A^i B_{i;k} = (A^i B_i)_{,k} = A^i{}_{,k} B_i + A^i B_{i,k}$
and using (4.4) we get $A^i{}_{;k} B_i = A^i{}_{,k} B_i + \Gamma^i_{lk} A^l B_i$. Since $B_i$ is an arbitrary
vector, we must have

$$A^i{}_{;k} = \frac{\partial A^i}{\partial x^k} + \Gamma^i_{lk} A^l = A^i{}_{,k} + \Gamma^i_{lk} A^l. \tag{4.6}$$

The rule of covariant differentiation of a tensor of arbitrary rank is
easily obtained: we introduce a ($+\Gamma$) term for each contravariant index
as in (4.6) and a ($-\Gamma$) term for each covariant index as in (4.4). Thus,
for the metric tensor we have

$$g_{ik;l} = \frac{\partial g_{ik}}{\partial x^l} - \Gamma^p_{il}\,g_{pk} - \Gamma^p_{kl}\,g_{ip}. \tag{4.7}$$

4.4 Riemannian geometry 

Einstein used the non-Euclidean geometry developed by Riemann to
describe his theory of gravitation. The Riemannian geometry introduces
the additional specification that

$$\Gamma^i_{kl} = \Gamma^i_{lk}, \qquad g_{ik;l} = 0. \tag{4.8}$$

Note that, as defined in the previous section, the affine connection
need not satisfy these conditions. Indeed, geometries for which the
above relations are not satisfied also exist. For the theory of relativity
and Riemannian geometry, however, these conditions are additionally
assumed.

Going back to (4.7) we see that $g_{ik;l} = 0$ gives us 40 linear equations
for the 40 unknowns $\Gamma^i_{kl}$. These equations have a unique solution. For,
from (4.7) and (4.8), we get

$$\Gamma_{k|il} + \Gamma_{i|kl} = g_{ik,l}, \tag{4.9}$$

where

$$\Gamma_{k|il} = g_{pk}\,\Gamma^p_{il}. \tag{4.10}$$

Rotate the indices cyclically to obtain two more relations:

$$\Gamma_{i|lk} + \Gamma_{l|ik} = g_{li,k}, \qquad \Gamma_{l|ki} + \Gamma_{k|li} = g_{kl,i}.$$

Next use the symmetry condition in (4.8) to eliminate $\Gamma_{l|ki} = \Gamma_{l|ik}$ and
$\Gamma_{k|il} = \Gamma_{k|li}$ from the above three relations to get

$$2\Gamma_{i|kl} = g_{ik,l} + g_{li,k} - g_{kl,i}.$$

On raising the index i we get the required solution:

$$\Gamma^i_{kl} = \frac{1}{2}\,g^{im}\left(\frac{\partial g_{mk}}{\partial x^l} + \frac{\partial g_{lm}}{\partial x^k} - \frac{\partial g_{kl}}{\partial x^m}\right). \tag{4.11}$$


In other words, once the metric tensor is known, the Christoffel 
symbols are fully determined. 
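
As a numerical illustration of (4.11), the following minimal sketch computes the Christoffel symbols of the unit 2-sphere directly from its metric, using central differences for the metric derivatives. The function names (`metric`, `christoffel`, and so on) and the choice of the sphere as a test case are assumptions made for this illustration; they are not part of the text.

```python
import math

def metric(x):
    """Metric g_ik of the unit 2-sphere, coordinates x = (theta, phi)."""
    theta = x[0]
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def inverse_2x2(g):
    """Inverse of a 2x2 matrix, giving the contravariant metric g^ik."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def metric_derivative(x, l, h=1e-5):
    """Central-difference estimate of g_ik,l (derivative along x^l)."""
    xp, xm = list(x), list(x)
    xp[l] += h
    xm[l] -= h
    gp, gm = metric(xp), metric(xm)
    n = len(x)
    return [[(gp[i][k] - gm[i][k]) / (2 * h) for k in range(n)]
            for i in range(n)]

def christoffel(x):
    """Return gamma[i][k][l] = Gamma^i_kl evaluated from (4.11)."""
    n = len(x)
    ginv = inverse_2x2(metric(x))
    dg = [metric_derivative(x, l) for l in range(n)]  # dg[l][i][k] = g_ik,l
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for k in range(n):
            for l in range(n):
                gamma[i][k][l] = 0.5 * sum(
                    ginv[i][m] * (dg[l][m][k] + dg[k][l][m] - dg[m][k][l])
                    for m in range(n))
    return gamma

gamma = christoffel([math.pi / 3, 0.0])
print(gamma[0][1][1], gamma[1][0][1])  # compare with -sin(t)cos(t) and cot(t)
```

With index 0 standing for $\theta$ and index 1 for $\phi$, the two printed values can be compared with $-\sin\theta\cos\theta$ and $\cot\theta$, the non-zero symbols of this metric.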
4.5 Some useful identities


We next consider some particular identities relating to the Christoffel
symbols that are useful in various manipulations. If we differentiate the
determinant $g$ of the metric tensor we get

$$dg = g\,g^{ik}\,dg_{ik}. \tag{4.12}$$

This relation is useful in expressing some combinations of $\Gamma^i_{kl}$ and
covariant derivatives in relatively simple forms. Thus, using (4.9) and
(4.10), it is possible to prove the following relations:

$$\begin{aligned}
\Gamma^i_{il} &= \frac{1}{\sqrt{-g}}\frac{\partial}{\partial x^l}\left(\sqrt{-g}\right),\\
\Gamma^i_{kl}\,g^{kl} &= -\frac{1}{\sqrt{-g}}\frac{\partial}{\partial x^k}\left(\sqrt{-g}\,g^{ik}\right),\\
A^i{}_{;i} &= \frac{1}{\sqrt{-g}}\frac{\partial}{\partial x^i}\left(\sqrt{-g}\,A^i\right),\\
F^{ik}{}_{;k} &= \frac{1}{\sqrt{-g}}\frac{\partial}{\partial x^k}\left(\sqrt{-g}\,F^{ik}\right) \quad \text{for } F^{ik} = -F^{ki}.
\end{aligned} \tag{4.13}$$

(Here $A^i$ and $F^{ik}$ are respectively vector and tensor fields.) For example,
to prove the first relation note that (4.11) gives, with k = i,

$$\Gamma^i_{il} = \frac{1}{2}\,g^{im}\left(g_{mi,l} + g_{lm,i} - g_{il,m}\right).$$

Since $(g_{lm,i} - g_{il,m})$ is antisymmetric in (i, m), its product with the
symmetric $g^{im}$ vanishes. The result then follows when we recall (4.12).
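
The divergence relation in (4.13) can be spot-checked numerically. The sketch below is an illustration only: it assumes plane polar coordinates $(r, \theta)$ with metric diag$(1, r^2)$, for which the only Christoffel symbol entering $\Gamma^i_{li}A^l$ is $\Gamma^\theta_{r\theta} = 1/r$, and a made-up vector field `A`. Because this metric is positive definite, $\sqrt{g} = r$ takes the place of $\sqrt{-g}$.

```python
import math

def A(r, theta):
    """Hypothetical contravariant field (A^r, A^theta), chosen for the test."""
    return (r * r * math.cos(theta), math.sin(theta) / r)

def partial(f, r, theta, axis, h=1e-5):
    """Central-difference partial derivative of f(r, theta) along one axis."""
    if axis == 0:
        return (f(r + h, theta) - f(r - h, theta)) / (2 * h)
    return (f(r, theta + h) - f(r, theta - h)) / (2 * h)

def divergence_via_connection(r, theta):
    """A^i_;i = A^i_,i + Gamma^i_li A^l, with Gamma^theta_(r theta) = 1/r."""
    dAr_dr = partial(lambda a, b: A(a, b)[0], r, theta, 0)
    dAth_dth = partial(lambda a, b: A(a, b)[1], r, theta, 1)
    return dAr_dr + dAth_dth + A(r, theta)[0] / r

def divergence_via_determinant(r, theta):
    """(1/sqrt g) d/dx^i (sqrt g A^i), with sqrt g = r for diag(1, r^2)."""
    d_r = partial(lambda a, b: a * A(a, b)[0], r, theta, 0)
    d_th = partial(lambda a, b: a * A(a, b)[1], r, theta, 1)
    return (d_r + d_th) / r

print(divergence_via_connection(2.0, 0.7), divergence_via_determinant(2.0, 0.7))
```

The two printed numbers should agree to within the finite-difference error, which is the content of the third relation in (4.13) for this particular field.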


Example 4.5.1 Problem. If a vector $\xi^i$ satisfies the relation $\xi^l{}_{,m}\,g_{ln} + \xi^l{}_{,n}\,g_{lm} + g_{mn,l}\,\xi^l = 0$, show that $\xi_{m;n} + \xi_{n;m} = 0$.

Solution. Consider $\xi^l{}_{,m} = \xi^l{}_{;m} - \Gamma^l_{mp}\,\xi^p$. We then have

$$g_{ln}\left(\xi^l{}_{;m} - \Gamma^l_{mp}\,\xi^p\right) + \left(\xi^l{}_{;n} - \Gamma^l_{np}\,\xi^p\right)g_{lm} + g_{mn,l}\,\xi^l = 0,$$

i.e.,

$$\xi_{n;m} + \xi_{m;n} + \left[g_{mn,p} - \Gamma_{n|mp} - \Gamma_{m|np}\right]\xi^p = 0.$$

Since $\Gamma_{n|mp} + \Gamma_{m|np} = g_{mn,p}$, the result follows.

Problem. Show that $g^{lm}{}_{,n} = -g^{mk}\,g^{li}\,g_{ki,n}$ and hence deduce that $g^{lm}{}_{;n} = 0$.

Solution. Differentiate with respect to $x^n$ the identity $g^{im}g_{ml} = \delta^i_l$, to get

$$g^{im}{}_{,n}\,g_{ml} + g^{im}\,g_{ml,n} = 0.$$

Multiply by $g^{lk}$ and use the identity $g_{ml}\,g^{lk} = \delta^k_m$ to get

$$g^{ik}{}_{,n} + g^{im}\,g^{lk}\,g_{ml,n} = 0.$$

Relabel the free indices (i to l, k to m) and the dummy indices (m to i, l to k) to get the required answer.

4.6 Locally inertial coordinate systems

The symmetry condition in (4.8) enables us to choose special coordinates
in which the Christoffel symbols all vanish at any given point. Suppose
we start with $\Gamma^i_{kl} \neq 0$ in the coordinate system $(x^i)$ at point P. Let
the coordinates of P be $x^i_P$. Now define new coordinates in the
neighbourhood of P by

$$x'^k = x^k - x^k_P + \frac{1}{2}\left(\Gamma^k_{nm}\right)_P (x^n - x^n_P)(x^m - x^m_P). \tag{4.14}$$

Then we have at P

$$x'^k = 0, \qquad \frac{\partial x'^k}{\partial x^m} = \delta^k_m, \qquad \frac{\partial^2 x^k}{\partial x'^n\,\partial x'^m} = -\Gamma^k_{nm},$$

with the result that from (4.5)

$$\Gamma'^i_{kl} = 0.$$

Further, by a linear transformation we can arrange to have a coordinate
system with

$$g_{ik} = \eta_{ik} = \mathrm{diag}(+1, -1, -1, -1), \qquad \Gamma^i_{kl} = 0 \tag{4.15}$$

at our chosen point P. Such a coordinate system is called a locally inertial
coordinate system, for reasons that will become clear later. Apart
from its physical implications in general relativity, the locally inertial
coordinate system is often useful as a mathematical device for simplifying
calculations. We also warn the reader that the operative word is
‘local’: the simplifications implied in (4.15) cannot be achieved globally.
What prevents us from achieving a globally inertial coordinate system?
In seeking an answer to this question we encounter the most crucial
aspect in which a non-Euclidean geometry differs from its Euclidean
counterpart.

Exercises 

1. Show that if $\Gamma^i_{kl} \neq \Gamma^i_{lk}$ the condition $g_{ik;l} = 0$ implies that

$$\Gamma^i_{(kl)} = \frac{1}{2}\,g^{im}\left(g_{mk,l} + g_{ml,k} - g_{kl,m}\right) + g^{im}g_{kn}\,\Gamma^n_{[lm]} + g^{im}g_{ln}\,\Gamma^n_{[km]}.$$

2. Under a conformal transformation $g_{ik}$ changes to $e^{2\sigma}g_{ik}$, where $\sigma$ is a
real twice-differentiable function of spacetime coordinates. Show that the new
Christoffel symbols are given by

$$\bar\Gamma^i_{kl} = \Gamma^i_{kl} + \delta^i_k\,\sigma_{,l} + \delta^i_l\,\sigma_{,k} - g_{kl}\,g^{im}\,\sigma_{,m}.$$

3. For a symmetric tensor field $A^{ik}$ show that


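4. Show that, for a scalar field $\phi$, the wave operator takes the form

$$\Box\phi \equiv g^{ik}\phi_{;ik} = \frac{1}{\sqrt{-g}}\frac{\partial}{\partial x^k}\left(\sqrt{-g}\,g^{ik}\frac{\partial\phi}{\partial x^i}\right).$$

5. Show that, to arrive at a locally inertial system, it is necessary to have

$$\Gamma^i_{kl} = \Gamma^i_{lk}.$$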

6. Show that 

3 g _ pi rr m ^ rr m i 

A ffj/6 A m/o 

7. In polar coordinates $(r, \theta)$, the radial and transverse accelerations are
$(\ddot r - r\dot\theta^2)$ and $r\ddot\theta + 2\dot r\dot\theta$. Try to relate these expressions to the notion of
covariant differentiation described in this chapter.

8. Set up equations of parallel propagation of a vector along a latitude line
$\theta = 60°$ on a unit sphere. By what angle has a vector initially directed along
the longitude line at zero longitude ($\phi = 0$) turned by the time it has gone half-way
round the latitude circle?

9. Show that Maxwell’s equations are invariant under the conformal transformation
$g_{ik} \to g_{ik}\,e^{2\sigma}$.

10. A vector is moved parallel to itself along the line t = constant on the
paraboloid of revolution whose metric is given by (see Exercise 7 of Chapter 3)

$$ds^2 = 4a^2\left[(1 + t^2)\,dt^2 + t^2\,d\phi^2\right].$$

Initially the vector was lying tangentially to the line t = constant. Show that on
moving round once from $\phi = 0$ to $\phi = 2\pi$ the vector makes an angle $2\pi/\sqrt{t^2 + 1}$
with its initial direction.



Chapter 5 

Curvature of spacetime 


5.1 Parallel propagation around finite curves 

Figure 5.1 repeats the previous example of non-Euclidean geometry on 
the surface of a sphere which we discussed in Section 2.2 of Chapter 2. 
We have the triangle ABC of Figure 2.3 whose three angles are each 90°. 
Consider what happens to a vector (shown by a dotted arrow) as it is 
parallely transported along the three sides of this triangle. As shown in 
Figure 5.1, this vector is originally perpendicular to AB when it starts its 
journey at A. When it reaches B it lies along CB; it keeps pointing along 
this line as it moves from B to C. At C it is again perpendicular to AC. So, 
as it moves along CA from C to A, it maintains this perpendicularity, 
with the result that when it arrives at A it is pointing along AB. In 
other words, one circuit around this triangle has resulted in a change of 
direction of the vector by 90°, although at each stage it was being moved 
parallel to itself! 

A similar experiment with a triangle drawn on a flat piece of paper 
will tell us that there is no resulting change in the direction of the vector 
when it moves parallel to itself around the triangle. So our spherical 
triangle behaves differently from the flat Euclidean triangle. 

The phenomenon illustrated in Figure 5.1 can also be described as 
follows. If we had moved our vector from A to C along two different 
routes - (i) along AC and (ii) along AB followed by BC - we would have 
found it pointing in two different directions in the two cases. In fact, if 
we had taken any arbitrary curves from A to C we would have found that 
the outcome of parallel transport of a vector from A to C varies from 
curve to curve; that is, the outcome depends on the path of transport
from A to C.

Fig. 5.1. A figure illustrating the parallel-transport problems on a spherical surface described in the text.


Recall that we had introduced the concept of parallel transport with 
the proviso that we would apply it to transport along an infinitesimal 
path. We are now witnessing the consequence of breaking that fiat and 
taking the concept to finite lengths. The above example shows that the 
result of parallel transport is path-dependent. 

This is one of the properties that distinguishes a curved space from 
a flat space. Let us consider it in more general terms for our four-dimensional
spacetime. Let a vector $B_i$ at P be transported parallely to
Q and let us ask for the condition that the answer should be independent
of the curve joining P to Q. (See Figure 5.2.) We have seen that, under
parallel transport from a point $\{x^i\}$ to a neighbouring point $\{x^i + \delta x^i\}$,
the components of the vector change according to (4.2). If it were possible
to transport $B_i$ from P to Q without the result depending on which
path is taken, then we would be able to generate a vector field $B_i(x^k)$
satisfying the differential equation

$$\frac{\partial B_i}{\partial x^k} = \Gamma^l_{ik}\,B_l. \tag{5.1}$$

So the answer to our question depends on whether we can find a nontrivial
solution to the system of differential equations (5.1).

Fig. 5.2. A vector B at P, transported along two curves $\Gamma$ and $\Gamma'$ to Q, ends up having different directions at Q, as shown here. The dotted vector was obtained by moving along $\Gamma$ and the continuous one by moving along $\Gamma'$.

A necessary condition for the existence of a solution is easily derived.
We differentiate (5.1) with respect to $x^n$ to get

$$\frac{\partial^2 B_i}{\partial x^n\,\partial x^k} = \frac{\partial\Gamma^l_{ik}}{\partial x^n}\,B_l + \Gamma^l_{ik}\,\frac{\partial B_l}{\partial x^n} = \left(\frac{\partial\Gamma^m_{ik}}{\partial x^n} + \Gamma^l_{ik}\Gamma^m_{ln}\right)B_m.$$

Here we have repeatedly used the relation (5.1) to eliminate derivatives
of $B_i$. We now interchange the order of differentiation with respect to
$x^n$ and $x^k$ and use the identity $B_{i,nk} = B_{i,kn}$. We then get the required
necessary condition as

$$R_i{}^m{}_{kn}\,B_m = 0. \tag{5.2}$$

Here the four-indexed symbol R, as defined in (5.3) below, is independent
of the vector $B_m$ and so we conclude that the spacetime must
satisfy the condition

$$R_i{}^m{}_{kn} \equiv \frac{\partial\Gamma^m_{ik}}{\partial x^n} - \frac{\partial\Gamma^m_{in}}{\partial x^k} + \Gamma^l_{ik}\Gamma^m_{ln} - \Gamma^l_{in}\Gamma^m_{lk} = 0. \tag{5.3}$$

It is not obvious simply from the above expression that $R_i{}^m{}_{kn}$ should
be a tensor. Yet our result, in order to be significant, must clearly hold
whatever coordinates we employ to derive it. So we do expect $R_i{}^m{}_{kn}$ to
be a tensor. A simple calculation shows that, for any twice differentiable
vector field $B_i$,

$$B_{i;nk} - B_{i;kn} = R_i{}^m{}_{kn}\,B_m. \tag{5.4}$$


Since the left-hand side is a tensor, so is the right-hand side and, $B_m$
being an arbitrary vector, we have, by virtue of the quotient law stated
in Chapter 2, the result that $R_i{}^m{}_{kn}$ are the components of a tensor. 1

1 The quotient law requires the vector $B_m$ to be arbitrary with respect to the left-hand side.
Isn’t $B_{i;nk}$ connected with $B_m$? The derivatives of $B_i$ at any given point can be arbitrarily
specified, even if $B_i$ at that point is known. Thus the condition of arbitrariness is met.

This tensor, known as the Riemann–Christoffel tensor (or, more commonly,
the Riemann tensor, or the curvature tensor), plays an important
role in specifying the geometrical properties of spacetime. Although we
have derived (5.3) as a necessary condition, a slightly more sophisticated
technique shows that (5.3) is also the sufficient condition that a vector
field $B_i(x^k)$ can be defined over the spacetime by parallel transport. We
will not, however, go into the detailed mathematical proof here. The
interested reader may look up Reference [7] listed at the end.

Spacetime is said to be flat if its Riemann tensor vanishes everywhere.
Otherwise, it is said to be curved. In the curved spacetime the
identity (5.4) can be generalized to a tensor of any rank by including
a term containing the Riemann tensor for each free index of the given
tensor.


Example 5.1.1 Problem. A contravariant vector on the surface of a unit
2-sphere with polar coordinates $\theta$, $\phi$ ($\theta = 0$ being the north pole and $\theta = \pi/2$
the equator) is parallely transported along the equator from $\phi = 0$ to
$\phi = \pi/2$, then similarly transported along the meridian ($\phi$ = constant) from
$\theta = \pi/2$ to $\theta = \pi/3$ and then along the latitude ($\theta$ = constant) from
$\phi = \pi/2$ to $\phi = 0$. Finally, it is transported similarly along the meridian back
to the starting point $\theta = \pi/2$, $\phi = 0$. Show that, if the affine connection is
Riemannian, the vector now makes an angle $\pi/4$ with its initial direction.
(See Figure 5.3.)

Fig. 5.3. The closed circuit described in the text is shown in this figure.

Solution. On the unit sphere we have

$$ds^2 = d\theta^2 + \sin^2\theta\,d\phi^2.$$

Thus with $x^1 = \theta$, $x^2 = \phi$ we have $g_{11} = 1$, $g_{22} = \sin^2\theta$. The non-zero
Christoffel symbols are $\Gamma_{1|22} = -\sin\theta\cos\theta$, $\Gamma_{2|12} = \sin\theta\cos\theta$, i.e., $\Gamma^1_{22} = -\sin\theta\cos\theta$
and $\Gamma^2_{12} = \cot\theta$.

Let a unit vector $(u^1, u^2)$ be parallely transported along a curve $\theta = \theta(\lambda)$,
$\phi = \phi(\lambda)$, where $\lambda$ is a parameter. Then the transport equations are

$$\frac{du^1}{d\lambda} + \Gamma^1_{22}\,u^2\,\frac{d\phi}{d\lambda} = \frac{du^1}{d\lambda} - \sin\theta\cos\theta\,u^2\,\frac{d\phi}{d\lambda} = 0$$

and

$$\frac{du^2}{d\lambda} + \Gamma^2_{12}\left(u^1\,\frac{d\phi}{d\lambda} + u^2\,\frac{d\theta}{d\lambda}\right) = 0,$$

i.e.,

$$\frac{du^2}{d\lambda} + \cot\theta\left(u^1\,\frac{d\phi}{d\lambda} + u^2\,\frac{d\theta}{d\lambda}\right) = 0.$$

The equator is given by $\theta = \pi/2$. So along the equator $du^1/d\lambda = 0$,
$du^2/d\lambda = 0$, i.e., $u^1 = \text{constant} = u^1_0$ and $u^2 = u^2_0$, where $(u^1_0, u^2_0)$ is the
starting value of the vector. Let us assume that it is a unit vector to start with.

Along the meridian $\phi = \pi/2$, $d\phi/d\lambda = 0$ so that $u^1 = \text{constant} = u^1_0$.
Also, $du^2/d\lambda + \cot\theta\,u^2\,d\theta/d\lambda = 0$, i.e., $u^2\sin\theta = \text{constant} = u^2_0$. Thus the
vector at $\theta = \pi/3$, $\phi = \pi/2$ is $(u^1_0, u^2_0\,\mathrm{cosec}(\pi/3))$, i.e., $(u^1_0, (2/\sqrt{3})\,u^2_0)$.

Then along the latitude $\theta = \pi/3$ we have $d\theta/d\lambda = 0$ and the two transport
equations are

$$\frac{du^1}{d\lambda} - \frac{\sqrt{3}}{4}\,u^2\,\frac{d\phi}{d\lambda} = 0, \qquad \frac{du^2}{d\lambda} + \frac{1}{\sqrt{3}}\,u^1\,\frac{d\phi}{d\lambda} = 0,$$

i.e.,

$$\frac{du^1}{d\phi} = \frac{\sqrt{3}}{4}\,u^2, \qquad \frac{du^2}{d\phi} = -\frac{1}{\sqrt{3}}\,u^1.$$

Differentiate the first with respect to $\phi$ and use the second equation:

$$\frac{d^2u^1}{d\phi^2} = \frac{\sqrt{3}}{4}\,\frac{du^2}{d\phi} = -\frac{1}{4}\,u^1, \quad \text{i.e.,} \quad u^1 = A\cos\frac{\phi}{2} + B\sin\frac{\phi}{2},$$

where A and B are arbitrary constants.
Also, we have

$$u^2 = \frac{4}{\sqrt{3}}\,\frac{du^1}{d\phi} = -\frac{2}{\sqrt{3}}\,A\sin\frac{\phi}{2} + \frac{2}{\sqrt{3}}\,B\cos\frac{\phi}{2}.$$

At $\phi = \pi/2$, $u^1 = (1/\sqrt{2})A + (1/\sqrt{2})B$, $u^2 = \sqrt{2/3}\,B - \sqrt{2/3}\,A$.

Since this vector is $(u^1_0, (2/\sqrt{3})\,u^2_0)$, we have $A + B = \sqrt{2}\,u^1_0$, $B - A = \sqrt{2}\,u^2_0$.
So $A = (1/\sqrt{2})(u^1_0 - u^2_0)$ and $B = (1/\sqrt{2})(u^1_0 + u^2_0)$. At $\phi = 0$ the
vector is given by $u^1 = (u^1_0 - u^2_0)/\sqrt{2}$, $u^2 = \sqrt{2/3}\,(u^1_0 + u^2_0)$.

For transport along $\phi = 0$ the equations are $u^1$ = constant,
$u^2\sin\theta$ = constant. So we have $u^1 = (u^1_0 - u^2_0)/\sqrt{2}$, $u^2\sin\theta = \sqrt{2/3}\,(u^1_0 + u^2_0)(\sqrt{3}/2) = (u^1_0 + u^2_0)/\sqrt{2}$.
Thus at $\theta = \pi/2$ we have $u^1 = (u^1_0 - u^2_0)/\sqrt{2}$,
$u^2 = (u^1_0 + u^2_0)/\sqrt{2}$. It can easily be verified that $(u^1, u^2)$ is a unit vector.
The angle made by this vector with the initial vector $(u^1_0, u^2_0)$ is given by $\psi$,
where

$$\cos\psi = g_{11}\,u^1_0\,u^1 + g_{22}\,u^2_0\,u^2 = u^1_0\,u^1 + u^2_0\,u^2 = \frac{1}{\sqrt{2}}\left[u^1_0\,(u^1_0 - u^2_0) + u^2_0\,(u^1_0 + u^2_0)\right] = \frac{1}{\sqrt{2}},$$

for $(u^1_0, u^2_0)$ is a unit vector. Hence $\psi = \pi/4$.
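
The result of Example 5.1.1 can also be checked by integrating the transport equations numerically around the same circuit. The sketch below is an illustration under stated assumptions: each leg is parametrized linearly by $\lambda \in [0, 1]$, a classical fourth-order Runge-Kutta step is used, and the function names `transport_leg` and `holonomy_angle` are invented for this purpose.

```python
import math

def transport_leg(u, start, end, steps=2000):
    """Parallel transport u = (u1, u2) along a coordinate line of the unit
    sphere from start = (theta, phi) to end, with lambda running 0 -> 1."""
    th0 = start[0]
    dth = end[0] - start[0]   # d(theta)/d(lambda), constant on the leg
    dph = end[1] - start[1]   # d(phi)/d(lambda), constant on the leg

    def rhs(lam, v):
        th = th0 + dth * lam
        v1, v2 = v
        dv1 = math.sin(th) * math.cos(th) * v2 * dph
        dv2 = -(math.cos(th) / math.sin(th)) * (v1 * dph + v2 * dth)
        return (dv1, dv2)

    h = 1.0 / steps
    lam = 0.0
    for _ in range(steps):
        k1 = rhs(lam, u)
        k2 = rhs(lam + h / 2, (u[0] + h / 2 * k1[0], u[1] + h / 2 * k1[1]))
        k3 = rhs(lam + h / 2, (u[0] + h / 2 * k2[0], u[1] + h / 2 * k2[1]))
        k4 = rhs(lam + h, (u[0] + h * k3[0], u[1] + h * k3[1]))
        u = (u[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             u[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
        lam += h
    return u

def holonomy_angle(u0):
    """Angle between u0 and its image after the circuit of Example 5.1.1."""
    corners = [(math.pi / 2, 0.0), (math.pi / 2, math.pi / 2),
               (math.pi / 3, math.pi / 2), (math.pi / 3, 0.0),
               (math.pi / 2, 0.0)]
    u = u0
    for a, b in zip(corners, corners[1:]):
        u = transport_leg(u, a, b)
    # At the starting point theta = pi/2, so g_11 = g_22 = 1 there.
    cos_psi = u0[0] * u[0] + u0[1] * u[1]
    return math.acos(max(-1.0, min(1.0, cos_psi)))

print(holonomy_angle((1.0, 0.0)), math.pi / 4)
```

The angle should not depend on which unit vector is chosen at the start, which is a useful consistency check on the integration.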


5.1.1 Symmetries of $R_{iklm}$

It is more convenient to lower the second index of the Riemann tensor
to study its symmetry properties. Since the symmetry or antisymmetry
of a tensor does not depend on what coordinates are used, it is simplest
to write (5.3) in the locally inertial coordinates (4.15). We
then get

$$R_{iklm} = \frac{1}{2}\left(g_{kl,im} + g_{im,kl} - g_{km,il} - g_{il,km}\right). \tag{5.5}$$

From this expression the following symmetries are immediately obvious:

$$R_{iklm} = -R_{kilm} = -R_{ikml} = R_{lmik}. \tag{5.6}$$

We also get relations of the following type:

$$R_{iklm} + R_{imkl} + R_{ilmk} = 0. \tag{5.7}$$

If we take all these symmetries into account, we find that of the $4^4 = 256$
components of the Riemann tensor, only 20 at most are independent!
For, consider the first pair (i, k) in $R_{iklm}$. It has altogether six independent
combinations, in view of the antisymmetry of $R_{iklm}$ with respect to
(i, k). Similarly the last pair (l, m) has six independent values. Since,
from (5.6), $R_{iklm} = R_{lmik}$ we have altogether 6 × 7/2 = 21 independent
components of this tensor. However, Equation (5.7) generates one more
constraint, reducing the above number to 20. Moreover, we will soon
see that there are identities linking their derivatives too.
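
The counting argument above can be written out for a space of any dimension n. The sketch below follows the same steps (antisymmetric pairs, symmetry under pair exchange, then the cyclic constraints) and compares the result with the standard closed form $n^2(n^2 - 1)/12$; the function name and the statement that the cyclic identity contributes one independent constraint per choice of four distinct indices are assumptions brought in for the illustration.

```python
from math import comb

def riemann_independent_components(n):
    """Count independent components of R_iklm in n dimensions."""
    pairs = comb(n, 2)                     # independent antisymmetric pairs (i, k)
    symmetric = pairs * (pairs + 1) // 2   # symmetric matrix in the pair labels
    cyclic = comb(n, 4)                    # extra constraints from (5.7)
    return symmetric - cyclic

for n in (2, 3, 4):
    print(n, riemann_independent_components(n), n * n * (n * n - 1) // 12)
```

For n = 4 the three steps give 6 pairs, 21 components and 1 constraint, reproducing the figure of 20 quoted in the text.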
5.1.2 The Ricci and Einstein tensors

By the process of contraction we can construct lower-rank tensors from
$R_{iklm}$. The tensor

$$R_{in} = R_i{}^m{}_{mn} \tag{5.8}$$

is called the Ricci tensor. If we use the locally inertial coordinate system,
we immediately see that $R_{in} = R_{ni}$. In a general frame we get

$$R_{in} = \frac{\partial^2 \ln\sqrt{-g}}{\partial x^i\,\partial x^n} - \frac{\partial\Gamma^m_{in}}{\partial x^m} + \Gamma^l_{im}\Gamma^m_{ln} - \Gamma^l_{in}\,\frac{\partial \ln\sqrt{-g}}{\partial x^l}. \tag{5.9}$$

Owing to the symmetries of (5.6) there are no other independent second-rank
tensors that can be constructed out of $R_{iklm}$.

By further contraction we get a scalar:

$$R = g^{ik}R_{ik} = R^k{}_k. \tag{5.10}$$

R is called the scalar curvature. The tensor

$$G_{ik} \equiv R_{ik} - \frac{1}{2}\,g_{ik}R \tag{5.11}$$

will turn out to have a special role to play in Einstein’s general relativity.
This tensor is called the Einstein tensor.


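Example 5.1.2 Problem. If $F^{ik}$ is an antisymmetric tensor, then $F^{ik}{}_{;ik} = 0$.

Solution. We have from antisymmetry

$$F^{ik}{}_{;ik} = -F^{ki}{}_{;ik} = -F^{ik}{}_{;ki}.$$

However,

$$F^{ik}{}_{;ik} - F^{ik}{}_{;ki} = -R_m{}^i{}_{ki}\,F^{mk} - R_m{}^k{}_{ki}\,F^{im} = R_{mk}F^{mk} - R_{mi}F^{im} = 0,$$

since each term contracts the symmetric Ricci tensor with the antisymmetric $F^{ik}$.
Thus $F^{ik}{}_{;ik} = F^{ik}{}_{;ki}$. Hence the result follows.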


Example 5.1.3 Problem. A two-dimensional space has a metric given by
$ds^2 = g_{11}(dx^1)^2 + g_{22}(dx^2)^2$. Show that $R_{11}\,g_{22} = R_{22}\,g_{11}$ and $R_{12} = 0$, and
that the Einstein tensor is identically zero.

Solution. We have $g^{11} = 1/g_{11}$, $g^{22} = 1/g_{22}$ and $g = g_{11}g_{22}$. The non-zero
Christoffel symbols are

$$\Gamma^1_{11} = \frac{g_{11,1}}{2g_{11}}, \qquad \Gamma^1_{21} = \frac{g_{11,2}}{2g_{11}}, \qquad \Gamma^2_{22} = \frac{g_{22,2}}{2g_{22}},$$

$$\Gamma^2_{11} = -\frac{g_{11,2}}{2g_{22}}, \qquad \Gamma^1_{22} = -\frac{g_{22,1}}{2g_{11}}, \qquad \Gamma^2_{12} = \frac{g_{22,1}}{2g_{22}}.$$

A simple but tedious calculation then gives $R_{12} = 0$ and

$$R_{11} = \frac{g_{22,11} + g_{11,22}}{2g_{22}} - \frac{(g_{22,1})^2}{4g_{22}^2} - \frac{(g_{11,2})^2}{4g_{11}g_{22}} - \frac{g_{11,2}\,g_{22,2}}{4g_{22}^2} - \frac{g_{11,1}\,g_{22,1}}{4g_{11}g_{22}}.$$

On writing $R_{11}/g_{11}$ one can easily check by interchanging 1 and 2 that the
outcome equals $R_{22}/g_{22}$. Thus the required result follows.


5.1.3 Bianchi identities 

The expression (5.5) suggests another symmetry for the components of
$R_{iklm}$. This symmetry is not algebraic but involves calculus. In covariant
language we may express it as follows:

$$R_{iklm;n} + R_{iknl;m} + R_{ikmn;l} = 0. \tag{5.12}$$

These relations are known as the Bianchi identities. Their proof is
most easily given in the locally inertial system as in (5.5). Simply write
the expressions for the three Rs in (5.12) in terms of the third derivatives
of the metric tensor.

By multiplying (5.12) by $g^{im}g^{kn}$ and using (5.8)–(5.10), we can
deduce from these identities another that is of importance to relativity:

$$\left(R^{ik} - \frac{1}{2}\,g^{ik}R\right)_{;k} \equiv G^{ik}{}_{;k} = 0. \tag{5.13}$$

In other words, the Einstein tensor $G^{ik}$ has zero divergence identically.


Example 5.1.4 Problem. Show that, if $R_{iklm} = K(g_{il}\,g_{km} - g_{im}\,g_{kl})$, then
K = constant.

Solution. We have $R_{kl} = g^{im}K(g_{il}\,g_{km} - g_{im}\,g_{kl}) = -3Kg_{kl}$.

Therefore $R = -12K$ and the Einstein tensor is

$$G_{ik} = -3Kg_{ik} + 6Kg_{ik} = 3Kg_{ik}.$$

Since $G^{ik}{}_{;k} = 0$, we have

$$(3Kg^{ik})_{;k} = 0, \quad \text{i.e.,} \quad g^{ik}K_{,k} = 0,$$

since $g^{ik}{}_{;k} = 0$. Thus K = constant.





Fig. 5.4. The tangent vector to a geodesic does not change its direction. In the figure the tangent vectors at points 1 and 2 are technically parallel, once we take the non-Euclidean geometry into account.

Fig. 5.5. For a geodesic connecting $P_1$ and $P_2$, all lines joining these points will have the same length for small displacement.

5.2 Geodesics

So far we have talked about non-Euclidean geometries without mentioning
whether, in general, they have the equivalents of straight lines in
Euclidean geometry. We now show how equivalent concepts do exist in
the Riemannian geometry under consideration here.

There are two properties of a straight line that can be generalized:
the property of ‘straightness’ and the property of ‘shortest distance’.
Straightness means that, as we move along the line, its direction does
not change. Let us see how we can generalize this concept first.

Let $x^i(\lambda)$ be the parametric representation of a curve in spacetime.
Its tangent vector is given by

$$u^i = \frac{dx^i}{d\lambda}. \tag{5.14}$$

Our straightness criterion demands that $u^i$ should not change as it moves
along the curve. (See Figure 5.4.) In going from $\lambda$ to $\lambda + \delta\lambda$, the change
in $u^i$ is given by

$$\Delta u^i = \frac{du^i}{d\lambda}\,\delta\lambda + \Gamma^i_{kl}\,u^k\,\delta x^l.$$

The second expression on the right-hand side arises from the change
produced by parallel transport through a coordinate displacement $\delta x^l$.
However, since the displacement arises from $\lambda$ changing to $\lambda + \delta\lambda$, we
have $\delta x^l = u^l\,\delta\lambda$. Therefore the condition of no change of direction of $u^i$
implies $\Delta u^i = 0$; that is,

$$\frac{du^i}{d\lambda} + \Gamma^i_{kl}\,u^k u^l = 0. \tag{5.15}$$

This is the condition that our curve must satisfy in order to be straight.

The second property of a straight line in Euclidean geometry is that
it is the curve of shortest distance between two points. Let us generalize
this property in the following way. Let the curve, parametrized by $\lambda$,
connect two points $P_1$ and $P_2$ of spacetime, with parameters $\lambda_1$ and $\lambda_2$,
respectively. Then the ‘distance’ of $P_2$ from $P_1$ is defined as

$$s(P_2, P_1) = \int_{\lambda_1}^{\lambda_2}\left(g_{ik}\,\frac{dx^i}{d\lambda}\frac{dx^k}{d\lambda}\right)^{1/2} d\lambda \equiv \int_{\lambda_1}^{\lambda_2} L\,d\lambda, \tag{5.16}$$

say. We now demand that $s(P_2, P_1)$ be ‘stationary’ for small displacements
of the curve connecting $P_1$ and $P_2$, with these displacements
vanishing at $P_1$ and $P_2$. (See Figure 5.5.)

This is a standard problem in the calculus of variations, and its
solution leads to the familiar Euler-Lagrange equations

$$\frac{d}{d\lambda}\left(\frac{\partial L}{\partial \dot x^i}\right) - \frac{\partial L}{\partial x^i} = 0, \tag{5.17}$$

where $\dot x^i = dx^i/d\lambda$ and $L = [g_{ik}(dx^i/d\lambda)(dx^k/d\lambda)]^{1/2}$ is a function of
$x^i$ and $\dot x^i$. It is easy to see that (5.17) leads to

$$\frac{d}{d\lambda}\left(\frac{1}{L}\,g_{ik}\,\frac{dx^k}{d\lambda}\right) - \frac{1}{2L}\,g_{mn,i}\,\frac{dx^m}{d\lambda}\frac{dx^n}{d\lambda} = 0.$$

If we substitute

$$ds = L\,d\lambda \tag{5.18}$$

and use (4.9), we get the above equation in the form

$$\frac{d^2x^i}{ds^2} + \Gamma^i_{kl}\,\frac{dx^k}{ds}\frac{dx^l}{ds} = 0. \tag{5.19}$$

There are a few loose ends to be sorted out in the above derivation. First,
L would be real only for timelike curves. Thus, if we want to use a real
parameter along the curve, then for spacelike curves we must replace ds
by

$$d\sigma = i\,ds, \qquad i = \sqrt{-1}. \tag{5.20}$$

For null curves, L = 0. The above treatment therefore breaks down.
It is then more convenient to replace the integral (5.16) by another,
namely,

$$I = \int_{\lambda_1}^{\lambda_2} L^2\,d\lambda, \tag{5.21}$$

and consider $\delta I = 0$. We can always choose a new parameter $\bar\lambda = \bar\lambda(\lambda)$
such that the equation of the curve takes the same form as (5.19), with
$\bar\lambda$ replacing s.

It is easy to see that (5.19) is the same as (5.15). Although s in
(5.19) has the special meaning ‘length along the curve’ while $\lambda$ in (5.15)
appears to be general, it is not difficult to see that, if (5.15) is satisfied,
$\lambda$ must be a constant multiple of s. This is because (5.15) has the first
integral

$$g_{ik}\,\frac{dx^i}{d\lambda}\frac{dx^k}{d\lambda} = C, \qquad C = \text{constant}. \tag{5.22}$$

These curves of ‘stationary distance’ are called geodesics. For timelike
curves $C > 0$ and for spacelike curves $C < 0$, while for null curves
$C = 0$. $\lambda$ is called an affine parameter.
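
As an illustration of (5.19) and the first integral (5.22), the following sketch integrates the geodesic equations on the unit 2-sphere, where $\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi_{\theta\phi} = \cot\theta$. It is only a sketch: the sphere stands in for spacetime (so C is positive), the integrator is a plain fourth-order Runge-Kutta scheme, and the function names and initial data are chosen for the example.

```python
import math

def geodesic_rhs(state):
    """Right-hand side of (5.19) on the unit sphere.
    state = (theta, phi, dtheta/dlam, dphi/dlam)."""
    th, ph, vth, vph = state
    ath = math.sin(th) * math.cos(th) * vph * vph            # -Gamma^th_phph vph^2
    aph = -2.0 * (math.cos(th) / math.sin(th)) * vth * vph   # -2 Gamma^ph_thph vth vph
    return (vth, vph, ath, aph)

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(s, k, f):
        return tuple(si + f * ki for si, ki in zip(s, k))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, h / 2))
    k3 = geodesic_rhs(shifted(state, k2, h / 2))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def first_integral(state):
    """C = g_ik (dx^i/dlam)(dx^k/dlam) for the unit-sphere metric."""
    th, _, vth, vph = state
    return vth * vth + math.sin(th) ** 2 * vph * vph

def integrate(state, lam_end=2.0, steps=2000):
    h = lam_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

start = (math.pi / 2, 0.0, -0.6, 0.8)   # on the equator, unit speed
end = integrate(start)
print(first_integral(start), first_integral(end))
```

Because the starting speed is unity, $\lambda$ here coincides with the arc length s, and the end point should lie at angular distance 2 from the starting point along a great circle.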



Fig. 5.6. On a spherical surface the separation PQ between two neighbouring longitudes increases on going from pole O to the equator but decreases as we move from the equator to the other pole O'.

Example 5.2.1 Let us calculate the null geodesic from t = 0, r = 0 to the
point t = T, r = R, $\theta = \theta_1$, $\phi = \phi_1$ in the de Sitter spacetime

$$ds^2 = c^2\,dt^2 - e^{2Ht}\left[dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\phi^2)\right],$$

where H = constant. It is not difficult to verify that the $\theta$ and $\phi$ equations
of (5.19) are satisfied by $\theta = \theta_1$, $\phi = \phi_1$, both $\theta_1$ and $\phi_1$ being constants.
That is, our straight line moves in the fixed $(\theta, \phi)$ direction. The t equation
simplifies to

$$\frac{d^2t}{d\lambda^2} + \frac{H}{c^2}\,e^{2Ht}\left(\frac{dr}{d\lambda}\right)^2 = 0.$$

The first integral (5.22) gives, on the other hand, for ds = 0

$$c^2\left(\frac{dt}{d\lambda}\right)^2 = e^{2Ht}\left(\frac{dr}{d\lambda}\right)^2.$$

The two equations can be easily solved to give, with t = 0 and r = 0 at $\lambda = 0$,

$$t = \frac{1}{H}\ln\left(1 + \frac{H\lambda}{\lambda_0}\right), \qquad r = \frac{c\lambda}{H\lambda + \lambda_0},$$

where $\lambda_0$ is determined from the boundary condition that when r = R, t = T.
Note that a solution is possible only if R and T are related by the condition

$$R = \frac{c}{H}\left(1 - e^{-HT}\right).$$
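
The solution of Example 5.2.1 can be verified numerically. The sketch below assumes the normalization t = 0, r = 0 at $\lambda = 0$, units with c = 1, and arbitrary illustrative values of H and $\lambda_0$; derivatives are estimated by central differences, so the residuals are expected to be small but not exactly zero.

```python
import math

C_LIGHT = 1.0   # units with c = 1 (assumption for the illustration)
H = 0.5         # illustrative value of the constant H
LAM0 = 2.0      # illustrative value of lambda_0

def t_of(lam):
    """t(lambda) along the null geodesic, with t = 0 at lambda = 0."""
    return math.log(1.0 + H * lam / LAM0) / H

def r_of(lam):
    """r(lambda) along the null geodesic, with r = 0 at lambda = 0."""
    return C_LIGHT * lam / (H * lam + LAM0)

def d1(f, x, h=1e-4):
    return (f(x + h) - f(x - h)) / (2 * h)

def d2(f, x, h=1e-4):
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

def t_equation_residual(lam):
    """Left-hand side of the t equation of the geodesic."""
    return d2(t_of, lam) + (H / C_LIGHT ** 2) * math.exp(2 * H * t_of(lam)) * d1(r_of, lam) ** 2

def null_residual(lam):
    """c^2 (dt/dlam)^2 - e^{2Ht} (dr/dlam)^2, which vanishes for a null curve."""
    return C_LIGHT ** 2 * d1(t_of, lam) ** 2 - math.exp(2 * H * t_of(lam)) * d1(r_of, lam) ** 2

print(t_equation_residual(1.5), null_residual(1.5))
print(r_of(1.5), (C_LIGHT / H) * (1.0 - math.exp(-H * t_of(1.5))))
```

The last line compares r with $(c/H)(1 - e^{-Ht})$ at the same parameter value, which is the relation between R and T stated at the end of the example.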


5.3 Geodesic deviation 

We end this chapter by describing another geometrical feature that distinguishes
a flat spacetime from a curved one. Again we take the spherical
surface as illustrative of a curved space.

Imagine, as in Figure 5.6, longitude lines drawn between the poles
on a spherical Earth. We know and can verify by using Equation (5.19)
that the longitude lines are geodesics. Now consider two points P and
Q on two neighbouring lines of this set, located at the same distance
from the nearest pole. As both P and Q move away from the pole the
distance between them at first increases. However, the rate of increase is
not uniform; it is rapid at first but slows down, and the distance reaches
its maximum when P and Q are on the equator. Thereafter the distance PQ
decreases to zero as the other pole is reached.

This behaviour is different from that for geodesics drawn on the flat
space of a Euclidean plane. Figure 5.7 shows straight lines drawn from a
point O with points P and Q on two neighbouring geodesics, i.e., straight
lines. In this case it is easy to verify that the distance PQ increases at a
uniform rate with respect to the distance of the pair from O.

So one may say that the rate at which two neighbouring geodesics
deviate from each other gives us information on the curvature of space
and time. The general result valid for Riemannian geometries is derived
below.

Let a bundle of geodesics in a general Riemannian spacetime be
specified by a parameter $\mu$, so that a typical point on the $\mu$-geodesic
has the coordinates $x^k(\lambda, \mu)$, $\lambda$ being the affine parameter, as shown in
Figure 5.8. The vector $v^k = \partial x^k/\partial\mu$ denotes the rate of deviation from
one geodesic to another across the bundle. We first show that

$$v^k{}_{;l}\,u^l = u^k{}_{;l}\,v^l, \quad \text{where } u^k = \partial x^k/\partial\lambda. \tag{5.23}$$



Fig. 5.7. On a flat surface the separation PQ between two geodesics (straight lines as drawn here) increases uniformly as P and Q move further from the origin O of the geodesics.

The proof is simple from first principles. We have

$$v^k{}_{;l}\,u^l = \frac{\partial v^k}{\partial x^l}\,u^l + \Gamma^k_{lm}\,v^m u^l = \frac{\partial^2 x^k}{\partial\lambda\,\partial\mu} + \Gamma^k_{lm}\,v^m u^l.$$

Similarly, we also have

$$u^k{}_{;l}\,v^l = \frac{\partial u^k}{\partial x^l}\,v^l + \Gamma^k_{lm}\,u^m v^l = \frac{\partial^2 x^k}{\partial\mu\,\partial\lambda} + \Gamma^k_{lm}\,u^m v^l.$$

Since the order of partial differentiation with respect to $\lambda$ and $\mu$ can be
interchanged and also because $\Gamma^k_{lm} = \Gamma^k_{ml}$, the result follows.

We next show that

$$\frac{\delta^2 v^k}{\delta\lambda^2} + R^k{}_{lmn}\,u^l v^m u^n = 0. \tag{5.24}$$

In view of the equality proved above, the first term on the left-hand
side may be written as

$$(v^k{}_{;l}\,u^l)_{;m}\,u^m = (u^k{}_{;l}\,v^l)_{;m}\,u^m = u^k{}_{;lm}\,v^l u^m + u^k{}_{;l}\,v^l{}_{;m}\,u^m.$$

Using the identity (5.4), the first term on the right-hand side may be
replaced by

$$u^k{}_{;ml}\,v^l u^m - R^k{}_{ilm}\,u^i v^l u^m.$$

Therefore we have

$$\frac{\delta^2 v^k}{\delta\lambda^2} + R^k{}_{ilm}\,u^i v^l u^m = u^k{}_{;ml}\,v^l u^m + u^k{}_{;l}\,v^l{}_{;m}\,u^m. \tag{5.25}$$

Consider the first term on the right-hand side of the above equation.
By interchanging the dummy indices l, m it can be rewritten as

$$u^k{}_{;lm}\,v^m u^l = [u^k{}_{;l}\,u^l]_{;m}\,v^m - u^k{}_{;l}\,u^l{}_{;m}\,v^m.$$

Now the first term on the right-hand side of the above equation vanishes
since $u^k$ is the tangent vector to a geodesic. In the second term use the
identity (5.23) to replace $u^l{}_{;m}\,v^m$ by $v^l{}_{;m}\,u^m$. This term then cancels out
the second term on the right-hand side of Equation (5.25). Thus we get
zero on the right-hand side. This is the equation of geodesic deviation.

Fig. 5.8. In the bundle of geodesics the $\lambda$-parameter increases as one moves to the right on any geodesic. The parameter $\mu$ increases in the orthogonal direction. The separation v between two neighbouring geodesics tells us whether they are coming closer or moving apart. See the text for details.

The appearance of the Riemann tensor is an indication that we are 
looking at the effect of curvature. We will have occasion to return to this 
equation in the context of gravitational effects on motion. 
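
The contrast between Figures 5.6 and 5.7 can be made quantitative with a small numerical experiment. The sketch below is an illustration under stated assumptions: it takes two meridians of the unit sphere a small angle `dphi` apart and uses the exact great-circle distance between points at the same distance s from the pole as a stand-in for the deviation magnitude (valid to first order in `dphi`). On the unit sphere the separation $\eta$ is then expected to satisfy $d^2\eta/ds^2 \approx -\eta$, whereas on the plane the second derivative vanishes. The function names are invented for the example.

```python
import math

def sphere_separation(s, dphi):
    """Great-circle distance between two points at distance s from the pole
    on meridians dphi apart (unit sphere)."""
    return 2.0 * math.asin(math.sin(s) * math.sin(dphi / 2.0))

def plane_separation(s, dphi):
    """Straight-line distance between two points at distance s from O on
    rays dphi apart (Euclidean plane)."""
    return 2.0 * s * math.sin(dphi / 2.0)

def second_derivative(f, s, h=1e-2):
    """Central-difference estimate of the second derivative of f at s."""
    return (f(s + h) - 2.0 * f(s) + f(s - h)) / (h * h)

dphi = 1e-3
s = 0.7
eta = sphere_separation(s, dphi)
ratio = second_derivative(lambda x: sphere_separation(x, dphi), s) / eta
flat = second_derivative(lambda x: plane_separation(x, dphi), s)
print(ratio, flat)   # ratio should be close to -1, flat close to 0
```

The non-zero ratio on the sphere is the curvature term of the deviation equation at work; in the flat case the separation grows uniformly and no such term appears.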


5.4 Concluding remarks 

This is our introduction to non-Euclidean geometries insofar as they 
relate to general relativity. In the following chapter we will discuss 
spacetime symmetries. It may be somewhat mathematical. Those impatient
for applications to physics may wish to skip it and go to Chapter 7.
The results derived in Chapter 6 do, however, have important applications
to specific problems.


Exercises 

1. The Ricci tensor of a four-dimensional spacetime manifold satisfies the
condition

$$R_{ik} = f g_{ik}.$$

Deduce that f = constant.

2. A vector field $\xi_i$ satisfies the equations

$$\xi_{i;k} + \xi_{k;i} = 0.$$

Deduce that

$$\xi_{l;ik} = R_{likm}\,\xi^m.$$

3. In the spacetime whose metric is given by

$$ds^2 = e^{\phi}(dx^0)^2 - e^{-\phi}(x^2)^2(dx^3)^2 - e^{\lambda}\{(dx^1)^2 + (dx^2)^2\},$$

where $\phi$ and $\lambda$ are functions of $x^1$ and $x^2$ only, show that, provided $R_{ik} = 0$, for
i = k = 0,

$$\frac{\partial^2\phi}{(\partial x^1)^2} + \frac{\partial^2\phi}{(\partial x^2)^2} + \frac{1}{x^2}\,\frac{\partial\phi}{\partial x^2} = 0.$$

4. Write down the equations of a null geodesic in the spacetime given by the
line element

$$ds^2 = dt^2 - 2e^{x^1}\,dt\,dx^2 - (dx^1)^2 + \frac{1}{2}\,e^{2x^1}(dx^2)^2 - (dx^3)^2$$
and show that the following is a first integral of them: 


2x 2 


dx 1 

dX 


2\2 2x l 


3 + ) e' 


dx- 2 

dX 


[(x 2 )V‘ 3- 2e~ x ' 


dr 

— = constant, 
dX 


where X is an affine parameter. 

5. Two metrics $g^{(1)}_{ik}$ and $g^{(2)}_{ik}$ on a given spacetime give the same geodesic
curves. Show that their respective Christoffel symbols $\Gamma^{(1)i}_{kl}$ and $\Gamma^{(2)i}_{kl}$ satisfy a
relation of the form

\[ \Gamma^{(1)i}_{kl} - \Gamma^{(2)i}_{kl} = \delta^i_k V_l + \delta^i_l V_k, \]

where $V_k$ are the components of a vector.

6. A vector is parallel-propagated round a spherical triangle ABC. Show that,
at the end of the round the vector makes an angle $(A + B + C - \pi)$ with its
original direction.

7. Consider the conformal transformation

\[ g^*_{ik} = g_{ik}\,\Omega^2. \]

Show that under such a transformation the wave equation

\[ \Box\phi + \tfrac16 R\phi = 0 \]

remains invariant. (Here R is the scalar curvature.) 

8. Show that, under a conformal transformation, the Weyl tensor

\[ C_{iklm} = R_{iklm} - \tfrac12\,(g_{il}R_{km} - g_{im}R_{kl} - g_{kl}R_{im} + g_{km}R_{il}) + \tfrac16\,(g_{il}g_{km} - g_{im}g_{kl})\,R \]

is invariant. Deduce that, if the metric is conformal to the flat spacetime metric, 
the Weyl tensor vanishes.

9. For the metric

\[ ds^2 = c^2dt^2 - dr^2 - r^2(d\theta^2 + \sin^2\theta\,d\phi^2) \]

verify that $R_{iklm} = 0$.

10. Show that, if a geodesic is timelike over a finite part, then it is timelike 
throughout. 

11. Show that, if $\bar g_{ik} = g_{ik}\exp(2\zeta)$, then

\[ \bar R_{ik} = R_{ik} + 2\,(\zeta_{;ik} - \zeta_{;i}\zeta_{;k}) + g_{ik}\,(\Box\zeta + 2\zeta_{;l}\zeta^{;l}). \]



Chapter 6 

Spacetime symmetries 


6.1 Introduction 

In Euclidean geometry or in the pseudo-Euclidean spacetime of special
relativity, the geometrical properties are invariant under translations and
rotations. The same is not necessarily true of the non-Euclidean spacetimes
of general relativity. As we shall see in Chapter 8, the spacetime
geometry is intimately related to the distribution of gravitating matter
(and energy). A completely general spacetime arising from an arbitrary
distribution of gravitating objects will not have any symmetries
at all. Such cases are difficult to solve as solutions of Einstein’s
gravitational equations. It is, however, easier to solve problems where mass
distributions have certain symmetries. For example, a point mass in an
otherwise empty space is expected to generate a solution that has spherical
symmetry about that point. Cases like these may be looked upon as
approximations to reality. A similar approach is adopted in Newtonian
gravitation. For example, as a first approximation the gravitating masses
in the Solar System (the Sun and the planets) are treated as spherical
distributions. In this chapter we will look at certain symmetric spacetimes
that will be of use in solving specific problems in general relativity. The
main question that we shall begin with is that of how to identify a
symmetry in a given spacetime. How do we discover an intrinsic property
like symmetry, when given the spacetime metric?

We will have occasion to use symmetric and antisymmetric tensors.
To facilitate their writing as well as recognition of the nature of their
symmetry we will write indices $(ik)$ for a symmetric tensor and $[ik]$ for
an antisymmetric one. Thus, if $T_{ik}$ is any tensor,

\[ T_{(ik)} = \tfrac12\,(T_{ik} + T_{ki}), \qquad T_{[ik]} = \tfrac12\,(T_{ik} - T_{ki}). \]

This notation was introduced in Chapter 3. 

6.2 Displacement of spacetime 

It is worth recalling here the stress put on circles as special curves by 
the Greek philosopher Aristotle (384-322 BC). Aristotle argued that the 
displacement symmetry displayed by circles was unique: no other curve 
had it. By displacement symmetry, we mean the following. Take any 
finite arc of the circle and move it so as to place it anywhere else on 
the circle. It will lie congruently on the corresponding part of the circle. 
Because of this symmetry Aristotle felt that circles had a special role to 
play in the behaviour of natural phenomena. 

We will adopt a similar criterion in our specification of the symmetry
of a spacetime manifold. Suppose $x^i$ are the coordinates and $g_{ik}$ are
the components of the metric tensor specifying a spacetime manifold
$M$. Let P be a typical point with coordinates $x^i_P$. We may make a ‘copy’
of $M$, called $M'$, and imagine that $M$ is placed congruently on $M'$.

Imagine now an infinitesimal displacement of $M$ so that each point
moves over to a new place. Such a displacement may be described by
the relation

\[ x^i \to x^i + \xi^i, \tag{6.1} \]

where $\xi^i$ is an infinitesimal vector field. Equation (6.1) implies that the
point P with coordinates $x^i_P$ now moves over to a position that is occupied
by a point P' in the manifold $M'$ with coordinates $x^i_P + \xi^i(x_P)$. Figure 6.1
illustrates this move.


Fig. 6.1. In the shift described in the text the point P of $M$ falls on
P' of $M'$.







A simple example of such a displacement is an infinitesimal translation
or a rotation. In the three-dimensional Euclidean space we can
consider the rotation of a spherical surface about its centre. In Figure
6.2, the point P after rotation moves over to P'. However, under such a
displacement the new surface is indistinguishable from the old one. We
now ask the following question: what should be the condition on $\xi^i$ for
this to happen in the displacement given by Equation (6.1)?

To find this condition let us consider the two spacetimes in the
above problem. The point P' of $M'$ coincides with the point P of $M$
(see Figure 6.1). Since the coordinate system was carried along when P
was displaced to its new position, P continues to have coordinates $x^i_P$ in
$M$. P', on the other hand, has coordinates $x^i_P + \xi^i(x_P)$ in $M'$. Suppose
in $M'$ we now introduce a new coordinate system given by

\[ x'^i = x^i - \xi^i. \tag{6.2} \]

Under this transformation P' in $M'$ will have coordinates $x'^i = x^i_P$, the
same as the coordinates of P in $M$; and this must be true for all the
coinciding points of $M$ and $M'$. But what about the spacetime metric
at the corresponding points?

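The metric tensor at P in $M$ is $g_{ik}(x_P)$. The metric tensor at P' in the
old coordinate system was $g_{ik}(x^l_P + \xi^l_P)$, where $\xi^l_P = \xi^l(x_P)$. In the new
coordinate system this is transformed to

\[ g'_{mn}(P') = \left[\frac{\partial x^i}{\partial x'^m}\,\frac{\partial x^k}{\partial x'^n}\right]_{P'} g_{ik}(x^l_P + \xi^l_P). \tag{6.3} \]

Since $\xi^i$ is infinitesimal we can use the following approximations which
ignore errors of second and higher order in $\xi^i$ and its derivatives:

\[ \frac{\partial x^i}{\partial x'^m} = \delta^i_m + \xi^i{}_{,m}, \qquad g_{ik}(x^l_P + \xi^l_P) = g_{ik}(x_P) + \xi^l_P\,g_{ik,l}(x_P). \]

Then it is easy to see that, to first order in $\xi^i$,

\[ g'_{mn}(P') = g_{mn}(P) + \left[\xi^l g_{mn,l} + \xi^l{}_{,m}\,g_{ln} + \xi^l{}_{,n}\,g_{lm}\right]_P. \tag{6.4} \]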

From Equation (6.4) we see that $M$ and $M'$ become geometrically
indistinguishable at the coinciding points P and P' if the expression in
square brackets vanishes. Since P is any typical point of $M$ this relation
must hold everywhere. Thus $\xi^i$ must satisfy the set of equations

\[ \xi^l g_{mn,l} + \xi^l{}_{,m}\,g_{ln} + \xi^l{}_{,n}\,g_{lm} = 0. \tag{6.5} \]

Fig. 6.2. A rotation of the sphere about the NS polar axis takes point P
to P'.

Using the Riemannian affine connection, these equations can be rewritten
as

\[ \xi_{m;n} + \xi_{n;m} = 0. \tag{6.6} \]

Equations (6.5) or (6.6) are known as Killing’s equations and the
vector field $\xi^i$ is known as a Killing vector field. In general, for a
heterogeneous spacetime a non-trivial solution of (6.6) will not exist.

If the spacetime does admit a Killing vector field we can consider its
displacement under (6.1). Then, as shown above, the displaced spacetime
is indistinguishable from its original state (vide the example of
the sphere under rotation). The existence of such a displacement is an
indicator of symmetry. A displacement of this type is often referred to
as an isometry. Aristotle had precisely this concept in mind in his choice
of circles.


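Example 6.2.1 In terms of the spherical polar coordinates $\theta$, $\phi$ the line
element on the surface of a unit sphere is given by

\[ ds^2 = d\theta^2 + \sin^2\theta\,d\phi^2. \]

The Killing vector $\xi^i = (\xi^\theta, \xi^\phi)$ satisfies Equations (6.5) above, which are
explicitly as follows:

\[ \text{(i)}\ \frac{\partial\xi^\theta}{\partial\theta} = 0, \qquad \text{(ii)}\ \frac{\partial\xi^\theta}{\partial\phi} + \sin^2\theta\,\frac{\partial\xi^\phi}{\partial\theta} = 0, \qquad \text{(iii)}\ \frac{\partial\xi^\phi}{\partial\phi} + \cot\theta\,\xi^\theta = 0. \]

From (i) we get $\xi^\theta = f(\phi)$, where $f$ is an arbitrary function of $\phi$. Then
(ii) gives

\[ \sin^2\theta\,\frac{\partial\xi^\phi}{\partial\theta} = -f'(\phi), \quad \text{i.e.,} \quad \xi^\phi = f'(\phi)\cot\theta + g(\phi), \]

where $f'(\phi) = df/d\phi$ and $g(\phi)$ is an arbitrary function of $\phi$. On substituting
for $\xi^\theta$ and $\xi^\phi$ in (iii) we get

\[ g'(\phi) + [f''(\phi)\cot\theta + f(\phi)\cot\theta] = 0. \]

Since this must hold for all $\theta$ and $\phi$, we have

\[ g'(\phi) = 0, \qquad f''(\phi) + f(\phi) = 0. \]

Thus the most general solution of the Killing equations in this case is

\[ f(\phi) = A\sin\phi + B\cos\phi, \qquad g(\phi) = C; \]

\[ \xi^\theta = A\sin\phi + B\cos\phi, \qquad \xi^\phi = (A\cos\phi - B\sin\phi)\cot\theta + C, \]

where A, B and C are arbitrary constants. Thus there are three linearly
independent Killing vectors.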
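
The solution of Example 6.2.1 can be checked numerically. The following is a
minimal sketch, not part of the original text: it evaluates the left-hand side
of Killing's equations in the coordinate form (6.5) for the unit sphere, with
$\xi^\theta = A\sin\phi + B\cos\phi$ and $\xi^\phi = (A\cos\phi - B\sin\phi)\cot\theta + C$,
using central differences. The helper names `metric`, `xi`, `partial` and
`killing_residual`, and the step size `h`, are illustrative assumptions.

```python
import math


def metric(theta, phi):
    """Metric g_mn of the unit sphere in the coordinates (theta, phi)."""
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]


def xi(theta, phi, A, B, C):
    """Contravariant components (xi^theta, xi^phi) found in Example 6.2.1."""
    return [A * math.sin(phi) + B * math.cos(phi),
            (A * math.cos(phi) - B * math.sin(phi)) / math.tan(theta) + C]


def partial(f, x, l, h=1e-5):
    """Central-difference derivative of f(theta, phi) along coordinate l."""
    xp, xm = list(x), list(x)
    xp[l] += h
    xm[l] -= h
    return (f(*xp) - f(*xm)) / (2 * h)


def killing_residual(x, A, B, C):
    """Largest |xi^l g_mn,l + xi^l_,m g_ln + xi^l_,n g_lm| over all m, n."""
    g = metric(*x)
    v = xi(*x, A, B, C)
    worst = 0.0
    for m in range(2):
        for n in range(2):
            total = 0.0
            for l in range(2):
                dg = partial(lambda t, p: metric(t, p)[m][n], x, l)
                dxi_m = partial(lambda t, p: xi(t, p, A, B, C)[l], x, m)
                dxi_n = partial(lambda t, p: xi(t, p, A, B, C)[l], x, n)
                total += v[l] * dg + dxi_m * g[l][n] + dxi_n * g[l][m]
            worst = max(worst, abs(total))
    return worst


# The residual should vanish up to finite-difference error for any A, B, C.
print(killing_residual([1.0, 0.7], 1.0, 2.0, 3.0))
```

Away from the poles, where $\cot\theta$ is singular, the printed residual should
be at the level of the finite-difference error, for any choice of the three
constants.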


6.3 Some properties of Killing vectors

We now discuss some general properties of the Killing equation and its
solutions.

Integrability. Using the formulae (5.4) and (5.6), we at once deduce
a simple consequence of Equation (6.6):

\[ 2\xi_{m;np} = (\xi_{m;np} - \xi_{m;pn}) + (\xi_{p;nm} - \xi_{p;mn}) + (\xi_{n;pm} - \xi_{n;mp}), \]

i.e.,

\[ \xi_{m;np} = R^l{}_{pmn}\,\xi_l. \tag{6.7} \]


From Equation (6.7) we see that, if $\xi_l$ and its first derivatives $\xi_{l;m}$ are
known at a typical point P, we can determine all higher derivatives of $\xi_l$ at
P and hence the entire function $\xi_l$ in a neighbourhood of P, by Taylor
expansion. Thus, provided that Equations (6.6) and (6.7) have a solution,
we can formally write it in the form

\[ \xi_m(X) = A_m{}^n(X, P)\,\xi_n(P) + B_m{}^{pq}(X, P)\,\xi_{p;q}(P), \tag{6.8} \]

where X is a general point and the quantities $A_m{}^n$ and $B_m{}^{pq}$ depend on the
global properties of spacetime, i.e., on $g_{mn}$, and on the points P and X. By
virtue of Equation (6.6), $B_m{}^{pq} = -B_m{}^{qp}$. In a spacetime of $n$ dimensions
there are up to $n$ independent quantities $\xi_n(P)$ and up to $\tfrac12 n(n-1)$
independent quantities $\xi_{p;q}(P)$ because of the antisymmetry implied
by the Killing equations. Thus there are in general up to
$n + \tfrac12 n(n-1) = \tfrac12 n(n+1)$ linearly independent Killing vectors in a
spacetime of $n$ dimensions.
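
The counting argument is simple enough to tabulate. The short sketch below
(the function name `max_killing_vectors` is an illustrative assumption) adds
the $n$ values $\xi_n(P)$ to the $\tfrac12 n(n-1)$ independent antisymmetric
derivatives $\xi_{p;q}(P)$.

```python
def max_killing_vectors(n):
    """Upper bound n + n(n-1)/2 = n(n+1)/2 on independent Killing vectors."""
    values_at_p = n                       # components xi_n(P)
    derivatives_at_p = n * (n - 1) // 2   # antisymmetric xi_{p;q}(P)
    return values_at_p + derivatives_at_p


for n in (2, 3, 4):
    print(n, max_killing_vectors(n))
# prints:
# 2 3
# 3 6
# 4 10
```

The value 3 for $n = 2$ agrees with the three Killing vectors found on the
unit sphere in Example 6.2.1, and the value 10 for $n = 4$ is the number asked
for in the Minkowski exercise at the end of this chapter.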

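What are the conditions for Equation (6.7) to be integrable? From
Equation (6.7) we get

\[ \xi_{m;npq} = R^l{}_{pmn;q}\,\xi_l + R^l{}_{pmn}\,\xi_{l;q}, \]

\[ \xi_{m;nqp} = R^l{}_{qmn;p}\,\xi_l + R^l{}_{qmn}\,\xi_{l;p}. \]

Taking the difference of these and using Equation (5.4), we get

\[ \xi_{m;npq} - \xi_{m;nqp} = R_m{}^l{}_{pq}\,\xi_{l;n} + R_n{}^l{}_{pq}\,\xi_{m;l}. \]

From this follows the result

\[ \xi_l\,(R^l{}_{qmn;p} - R^l{}_{pmn;q}) + R^l{}_{qmn}\,\xi_{l;p} - R^l{}_{pmn}\,\xi_{l;q} - R^l{}_{mpq}\,\xi_{l;n} - R^l{}_{npq}\,\xi_{m;l} = 0. \tag{6.9} \]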


These are the conditions for integrability, which by relating $\xi_l$ and $\xi_{l;m}$
impose restrictions on how many Killing vectors can exist at a given
point of spacetime.

Example 6.3.1 Problem. $\xi^i$ is a timelike Killing vector and $u^i = \phi\,\xi^i$, with
$\phi$ so chosen that $u^i$ is a unit vector. Show that

\[ u^i u_{k;i} = \frac{\partial \ln\phi}{\partial x^k}. \]

Solution. On writing $\xi_i = \phi^{-1}u_i$ we get $\xi_{i;k} = \phi^{-1}u_{i;k} - u_i\,\phi_{,k}\,\phi^{-2}$. From
the Killing equations we get for the above relation

\[ \phi\,(u_{i;k} + u_{k;i}) - (u_i\,\phi_{,k} + u_k\,\phi_{,i}) = 0. \]

Multiply by $u^i$ and use $u^i u_i = 1$ and $u^i u_{i;k} = 0$. Then

\[ \phi\,u^i u_{k;i} = \phi_{,k} + u_k\,u^i\phi_{,i}. \]

Multiply by $u^k$ and use $u^k u_{k;i} = 0$ to get $u^i\phi_{,i} = 0$. Hence the above equation
becomes

\[ \phi\,u^i u_{k;i} = \phi_{,k}. \]

The result to be proved follows.

Problem. If $\xi^i$ is a Killing vector field and $T^{ik}$ is a symmetric tensor satisfying
the condition $T^{ik}{}_{;k} = 0$, then the vector $p^i = T^{ik}\xi_k$ has zero divergence.

Solution. We have that $p^i{}_{;i} = T^{ik}{}_{;i}\,\xi_k + T^{ik}\xi_{k;i} = T^{ik}\xi_{k;i} = \tfrac12 T^{ik}(\xi_{k;i} + \xi_{i;k}) = 0$
by virtue of the symmetry of $T^{ik}$ and the Killing equations. We have used
the property that $T^{ik}A_{ik} = 0$ if $T^{ik}$ is symmetric and $A_{ik}$ is antisymmetric.


Finite displacement. The above analysis of Killing vectors relates
to infinitesimal displacement. In a special case it is possible to talk of
a finite displacement. This is the case when all $g_{ik}$ are independent
of a particular coordinate, say $x^0$. Then direct substitution into (6.5)
immediately shows that

\[ \xi^i = (0, 0, 0, \epsilon), \tag{6.10} \]

where $\epsilon$ is an infinitesimal constant, is a solution. This means that a
displacement of the form

\[ x^0 \to x^0 + \epsilon, \qquad x^\mu \to x^\mu \tag{6.11} \]

leaves the spacetime invariant.

If $x^0$ is a timelike coordinate we say that the spacetime is static.
When Equations (6.10) and (6.11) hold, we need not restrict $\epsilon$ to be
infinitesimal. As is obvious, by a superposition of a series of infinitesimal
displacements we can make up a finite displacement that leaves the
spacetime invariant.

Relation to geodesics. If $u^i$ is a tangent vector to a geodesic $\Gamma$ and
$\xi_i$ is a Killing vector, then

\[ \xi_i u^i = \text{constant along } \Gamma. \tag{6.12} \]

The proof follows from the use of the geodesic equation (5.19) and the
Killing equation (6.6):

\[ u^i(\xi_k u^k)_{;i} = \xi_k\,u^k{}_{;i}\,u^i + u^k u^i\,\xi_{k;i} = 0. \]

Here the first term vanishes because of the geodesic equation and the second
because $\xi_{k;i}$ is antisymmetric. We shall use this result in later work.
Equation (6.12) represents a first integral of the geodesic equations.
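
The conservation law $\xi_i u^i$ = constant can be illustrated numerically on the
unit sphere of Example 6.2.1. The sketch below is an illustration under stated
assumptions, not a method taken from the text: it integrates the geodesic
equations $\ddot\theta = \sin\theta\cos\theta\,\dot\phi^2$, $\ddot\phi = -2\cot\theta\,\dot\theta\,\dot\phi$ with a
classical fourth-order Runge-Kutta step and monitors $\xi_i u^i$ for two of the
Killing vectors (the choices $C = 1$ and $A = 1$ of the constants). All function
names, the step size and the starting values are hypothetical choices.

```python
import math


def geodesic_rhs(state):
    """Geodesic equations on the unit sphere.

    state = (theta, phi, dtheta, dphi); derivatives are taken with respect
    to an affine parameter.
    """
    theta, phi, dtheta, dphi = state
    return (dtheta,
            dphi,
            math.sin(theta) * math.cos(theta) * dphi ** 2,
            -2.0 * dtheta * dphi / math.tan(theta))


def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(s, k, f):
        return tuple(si + f * ki for si, ki in zip(s, k))

    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, h / 2))
    k3 = geodesic_rhs(shifted(state, k2, h / 2))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def first_integrals(state):
    """xi_i u^i for xi = (0, 1) and for xi = (sin phi, cos phi cot theta)."""
    theta, phi, dtheta, dphi = state
    about_pole = math.sin(theta) ** 2 * dphi
    about_equator = (math.sin(phi) * dtheta
                     + math.sin(theta) * math.cos(theta) * math.cos(phi) * dphi)
    return about_pole, about_equator


def integrate(state, h=1e-3, steps=2000):
    """Advance the geodesic by steps * h in the affine parameter."""
    for _ in range(steps):
        state = rk4_step(state, h)
    return state


start = (1.0, 0.2, 0.3, 0.5)
end = integrate(start)
print(first_integrals(start))
print(first_integrals(end))
```

The two printed pairs should agree to within the integration error even though
$\theta$ and $\phi$ themselves change along the curve. The starting values are chosen
so that the geodesic stays away from the poles, where these coordinates break
down.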

Example 6.3.2 Problem. In the line element

\[ ds^2 = e^{\nu}dT^2 - e^{\lambda}dR^2 - R^2(d\theta^2 + \sin^2\theta\,d\phi^2) \]

show that the timelike geodesic has the first integrals

\[ e^{\nu}\frac{dT}{ds} = \text{constant}; \qquad R^2\sin^2\theta\,\frac{d\phi}{ds} = \text{constant}. \]

Here $\nu$ and $\lambda$ are functions of R only.

Solution. Since the metric is independent of T and $\phi$, we deduce that
the spacetime has Killing vectors $\xi^i_{(1)} = (1, 0, 0, 0)$ and $\xi^i_{(2)} = (0, 0, 0, 1)$,
where $x^i = (T, R, \theta, \phi)$. Hence we have two first integrals of the geodesic
equations:

\[ \xi_{i(1)}\,u^i = \text{constant}; \qquad \xi_{i(2)}\,u^i = \text{constant}. \]

The first one gives $u_0$ = constant. Since $u^0 = dT/ds$, we get $e^{\nu}\,dT/ds$ = constant.
The second relation likewise gives $u_3$ = constant. With $u^3 = d\phi/ds$,
we get $R^2\sin^2\theta\,d\phi/ds$ = constant.


6.4 Homogeneity and isotropy 

The physicist often refers to the above two properties of space and time.
Of these, homogeneity implies the fact that the physical quantity he
measures is the same at any two points P and Q in spacetime. Isotropy at
a given point P implies invariance with respect to a change of direction
at P. With the help of Killing vectors it is possible to express these
properties more formally and precisely than the above statements. Since
we might not always want the entire spacetime $M$ to be homogeneous
and/or isotropic, I shall consider below these properties in a spacetime
$M_n$ of $n$ dimensions that is a subspace of $M$.

Homogeneity. The spacetime $M_n$ is said to be homogeneous if there
are infinitesimal isometries that carry a typical point P to any point P'
in its immediate neighbourhood. This means that the Killing vectors
at P can take all possible values, and we can choose, at P, $n$ linearly
independent Killing vectors. By a suitable choice, we can therefore have
a basis of $n$ Killing vector fields $\xi^{(k)}_i(X, P)$ at a general point X in the
neighbourhood of P such that

\[ \lim_{X \to P}\,\xi^{(k)}_i(X, P) = \delta^k_i \qquad (k = 1, \ldots, n). \tag{6.13} \]

Clearly, by a succession of infinitesimal displacements we can take P to
any distant point P'.


Example 6.4.1 The surface of the unit sphere is homogeneous because 
at each point it has two linearly independent Killing vectors. As shown in 
Example 6.2.1, there are three Killing vector fields on this surface, so at a 
general point we can choose any two of them. 


Isotropy. The spacetime $M_n$ is said to be isotropic at a given point P if
there are Killing vectors $\xi_i$ in the neighbourhood of P such that $\xi_i(P) = 0$
and $\xi_{i;k}(P)$ span the space of antisymmetric second-rank tensors at P.
Thus we need $\tfrac12 n(n-1)$ linearly independent $\xi_{i;k}$ at P. In an isotropic
spacetime at P we can choose coordinates in the neighbourhood of P
such that there are $\tfrac12 n(n-1)$ Killing vector fields $\xi^{[pq]}_i(X, P)$ with the
properties

\[ \xi^{[pq]}_i(X, P) = -\xi^{[qp]}_i(X, P), \]

\[ \xi^{[pq]}_i(P, P) = 0, \tag{6.14} \]

\[ \xi^{[pq]}_{i;k}(P, P) = \left[\frac{\partial}{\partial x^k}\,\xi^{[pq]}_i(X, P)\right]_{X=P} = \delta^p_i\delta^q_k - \delta^q_i\delta^p_k \qquad (p, q = 1, \ldots, n). \]

Example 6.4.2 Consider the same example of the surface of the unit
sphere. In this case $n = 2$, i.e., $n(n-1)/2 = 1$. Since we have seen that this
surface is homogeneous, we can take P to be the pole ($\theta = 0$) without loss
of generality. At this point the Killing vector field $\xi^\theta = 0$, $\xi^\phi = 1$ shows
isotropy. The coordinates are $x^1 = \theta\sin\phi$, $x^2 = \theta\cos\phi$ near the pole
(where $\sin\theta \cong \theta$). In these coordinates the Killing vector field has covariant
components

\[ \xi_1 = x^2, \qquad \xi_2 = -x^1. \]

We may define $\xi^{[12]}_i = -\xi^{[21]}_i = \xi_i$. Then Equation (6.14) follows.


Theorem. Any $M_n$ that is isotropic about every point is also homogeneous.

Consider Killing vectors $\xi^{[pq]}_i(X, P)$ and $\xi^{[pq]}_i(X, Q)$ that satisfy
(6.14) at two neighbouring points P and Q, respectively. At point
X, $\xi^{[pq]}_i(X, Q) - \xi^{[pq]}_i(X, P)$ is also a Killing vector. On writing the
coordinates of P and Q, respectively, as $x^i_P$ and $x^i_P + \delta x^i_P$, we see that

\[ \lim_{\delta x_P \to 0}\,\frac{1}{\delta x^k_P}\left\{\xi^{[pq]}_i(X, Q) - \xi^{[pq]}_i(X, P)\right\} = \frac{\partial \xi^{[pq]}_i(X, P)}{\partial x^k_P} \tag{6.15} \]

is also a Killing vector at X. However, we also have, from Equation
(6.14),

\[ \xi^{[pq]}_{i;k}(P, P) = \lim_{Q \to P}\,\frac{\xi^{[pq]}_i(Q, P)}{\delta x^k_P} = \delta^p_i\delta^q_k - \delta^q_i\delta^p_k. \]

On putting X = Q in (6.15) and using $\xi^{[pq]}_i(Q, Q) = 0$ we get as Q → P

\[ \frac{\partial \xi^{[pq]}_i(X, P)}{\partial x^k_P} = -\delta^p_i\delta^q_k + \delta^q_i\delta^p_k. \tag{6.16} \]

The Killing vectors (6.16) obviously span the space of vectors at P. For,
if $a_l$ is any arbitrary vector at P, we can construct a general vector field
in the neighbourhood of P:

\[ \xi_i(X) = \frac{a_l}{n-1}\,\frac{\partial \xi^{[kl]}_i(X, P)}{\partial x^k_P}, \]

which is such that $\xi_i(P) = a_i$ (arbitrary constants). This follows from
Equation (6.16):

\[ \xi_i(P) = \frac{a_l}{n-1}\,(\delta^k_k\delta^l_i - \delta^k_i\delta^l_k) = a_i. \]

Thus any vector at any arbitrary point P can be expressed in terms of the
Killing vector fields at P. This proves the result.

Maximally symmetric spacetime. If $M_n$ is homogeneous and
isotropic, it is said to be maximally symmetric. From the above theorem,
if $M_n$ is isotropic at every point, it is maximally symmetric.

A maximally symmetric space has $\tfrac12 n(n+1)$ different Killing vector
fields. To see this we consider the set of vector fields $\xi^{(k)}_i(X, P)$ and
$\xi^{[pq]}_i(X, P)$. Suppose they satisfy a linear relation

\[ \alpha_k\,\xi^{(k)}_i(X, P) + \beta_{[pq]}\,\xi^{[pq]}_i(X, P) = 0 \]

at all points X, where the $\alpha$s and $\beta$s are constants. On setting X = P and
using Equations (6.13) and (6.14) we get

\[ \alpha_k\delta^k_i = \alpha_i = 0. \]

Next, by differentiating with respect to $x^k$ and setting X = P, we get

\[ \beta_{[pq]}\,(\delta^p_i\delta^q_k - \delta^q_i\delta^p_k) = 2\beta_{[ik]} = 0. \]

Hence these Killing vectors are linearly independent and the result follows.
The maximally symmetric space has $n$ Killing vectors for homogeneity
and $\tfrac12 n(n-1)$ Killing vectors for isotropy.


6.5 Spacetime of constant curvature 

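We now obtain an important result for maximally symmetric spaces,
which makes their explicit determination possible. In Equation (6.9) we
have the general integrability condition. On applying it (with the free
indices $p$, $q$ of Equation (6.9) relabelled $r$, $t$) to the vectors
$\xi^{[pq]}_l(X, P)$ at P we get, using Equation (6.14),

\[ (\delta^p_l\delta^q_r - \delta^q_l\delta^p_r)R^l{}_{tmn} - (\delta^p_l\delta^q_t - \delta^q_l\delta^p_t)R^l{}_{rmn} - (\delta^p_l\delta^q_n - \delta^q_l\delta^p_n)R^l{}_{mrt} - (\delta^p_m\delta^q_l - \delta^q_m\delta^p_l)R^l{}_{nrt} = 0. \]

The above equation can be simplified further. We get

\[ R^p{}_{tmn}\delta^q_r - R^p{}_{rmn}\delta^q_t - R^p{}_{mrt}\delta^q_n + R^p{}_{nrt}\delta^q_m = R^q{}_{tmn}\delta^p_r - R^q{}_{rmn}\delta^p_t - R^q{}_{mrt}\delta^p_n + R^q{}_{nrt}\delta^p_m. \tag{6.17} \]

If we now consider Equation (6.9) for the vectors $\xi^{(k)}_l(X, P)$ and use the
above relation at X = P, Equation (6.9) reduces to

\[ R^l{}_{qmn;p} = R^l{}_{pmn;q}. \tag{6.18} \]

On putting $q = r$ in Equation (6.17) and using the symmetry properties
of $R_{iklm}$, we get

\[ R_{ptmn} = \frac{1}{n-1}\,(R_{nt}\,g_{pm} - R_{mt}\,g_{pn}). \tag{6.19} \]

On multiplying further by $g^{tn}$ we get

\[ R_{pm} = \frac{R}{n}\,g_{pm}. \tag{6.20} \]

(In these reductions we have to use the relation $g^i{}_i = n$. For the
four-dimensional spacetime we have $n = 4$.)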

A spacetime satisfying Equation (6.20) is called an Einstein space.
The maximally symmetric space, on the other hand, has more symmetries
than the Einstein space. Substitution of Equation (6.20) into
Equation (6.19) gives

\[ R_{ptmn} = \frac{R}{n(n-1)}\,(g_{nt}\,g_{pm} - g_{mt}\,g_{pn}). \tag{6.21} \]

By taking the divergence of Equation (6.21) and using Equation (5.13)
(for $n$-dimensional spacetime), we see that for $n \ge 3$

\[ R = \text{constant} = n(n-1)K \ \text{(say)}, \tag{6.22} \]

where $K$ is a constant. Equation (6.22) then tells us that the spacetime
Riemann tensor has the form

\[ R_{ptmn} = K\,(g_{nt}\,g_{pm} - g_{mt}\,g_{pn}). \tag{6.23} \]

In differential geometry this is known as the curvature tensor for a space
of constant curvature $K$.
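
As a quick consistency check on the constant-curvature form, we may contract
$R_{ptmn} = K(g_{nt}g_{pm} - g_{mt}g_{pn})$ once and then twice. This is a sketch that
assumes the contraction convention $R_{tn} = g^{pm}R_{ptmn}$, which is the one under
which the Einstein-space relation $R_{pm} = (R/n)\,g_{pm}$ and $R = n(n-1)K$ are
mutually consistent; only $g^{pm}g_{pm} = n$ is used.

```latex
\[ R_{tn} = g^{pm}R_{ptmn} = K\,(n\,g_{nt} - g_{nt}) = (n-1)K\,g_{nt}, \]
\[ R = g^{tn}R_{tn} = n(n-1)K . \]
```

The Ricci tensor is thus proportional to the metric, as required of an
Einstein space, and the scalar curvature is the constant $n(n-1)K$.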

In the case n = 2, Equation (6.19) can be used to arrive at the same 
conclusion. 

It can be shown that spaces of constant curvature are essentially
unique. In other words, if we have two spacetimes $M_n$ and $M'_n$, with
Equation (6.23) holding in $M_n$ and

\[ R'_{ptmn} = K\,(g'_{nt}\,g'_{pm} - g'_{mt}\,g'_{pn}) \tag{6.24} \]

holding in $M'_n$ with the metric tensor $g'_{ik}$, then there exists a coordinate
transformation $x^i \to x'^i$ for $M_n \to M'_n$ that will take $g_{ik}$ to $g'_{ik}$ in the
usual manner of tensor transformations.

The proof of this result will not be given here, for want of space
(see Reference [7] for details). Using this result, however, it becomes
easy to identify maximally symmetric spacetimes in $n$ dimensions. The
essential difference is in the sign of $K$, since the magnitude of $K$ can be
scaled by a suitable scale transformation of $ds$. We shall, later on, need
the cases for which all the $n$ dimensions are spacelike. In this case we
have the following three line elements:

\[ ds^2 = -\frac{1}{K}\left\{(d\mathbf{x})^2 + \frac{(\mathbf{x}\cdot d\mathbf{x})^2}{1 - \mathbf{x}^2}\right\} \qquad (K > 0), \tag{6.25} \]

\[ ds^2 = -\frac{1}{|K|}\left\{(d\mathbf{x})^2 - \frac{(\mathbf{x}\cdot d\mathbf{x})^2}{1 + \mathbf{x}^2}\right\} \qquad (K < 0), \tag{6.26} \]

\[ ds^2 = -(d\mathbf{x})^2 \qquad (K = 0). \tag{6.27} \]

Here $\mathbf{x} = (x^1, \ldots, x^n)$ is the coordinate vector and $\mathbf{x}^2$ is the square of
its magnitude. It can be verified that these spaces do satisfy Equation
(6.23) so that, by virtue of the uniqueness theorem, they contain all the
required information about homogeneous and isotropic spaces.


6.6 Symmetric subspaces

In general the entire spacetime might not have many symmetries, but
it may have subspaces with more symmetries. In particular it may have
maximally symmetric subspaces. Although our eventual application will
be to the (3 + 1)-dimensional spacetime we will continue to discuss
$m$-dimensional subspaces in an $n$-dimensional ($n > m$) spacetime $M_n$.

Suppose $\{\varphi_m\}$ is a collection of subspaces within $M_n$. We will
choose a coordinate system $x^i$ such that $x^1, \ldots, x^m$ denote different
points on the same $\varphi_m$, while the remaining coordinates $x^{m+1}, \ldots, x^n$ for
these points are the same. In other words, the variation of $(x^{m+1}, \ldots, x^n)$
denotes different members of $\{\varphi_m\}$ while the variation of $x^1, \ldots, x^m$
represents the variation on a given $\varphi_m$.

We say that the spaces $\varphi_m$ are homogeneous in $M_n$ if there exist
at least $m$ linearly independent Killing vectors that take any point P on
$\varphi_m$ into any other given point P' on $\varphi_m$ while leaving $\varphi_m$ as a whole
invariant.


Example 6.6.1 The rotation of the 2-sphere about its centre in the
(3 + 1)-dimensional spacetime of special relativity. Here $\varphi_2$ is the surface
of the sphere.


A similar definition can be given for isotropy of $\varphi_m$. Of particular
interest is the case in which $\varphi_m$ is maximally symmetric. In this case there
exist $\tfrac12 m(m+1)$ independent Killing vectors, each with the following
property. For an infinitesimal displacement of the type

\[ x^i \to x^i + \xi^i, \quad i = 1, \ldots, m; \qquad x^i \to x^i, \quad i > m, \tag{6.28} \]

the whole space $M_n$ is unchanged. The $\xi^i$ therefore have zero components
for $i > m$, although they can be functions of all $x^i$. The linear
independence of all the $\tfrac12 m(m+1)$ different $\xi^i$ implies therefore that
there is no linear relation among them with coefficients depending on
$x^1, \ldots, x^m$.

It can then be shown (see Reference [7] for proof) that the line
element of $M_n$ can be written down in the form

\[ ds^2 = f(x^{m+1}, \ldots, x^n) \sum_{i,k \le m} h_{ik}(x^1, \ldots, x^m)\,dx^i\,dx^k + \sum_{i,k > m} g_{ik}(x^{m+1}, \ldots, x^n)\,dx^i\,dx^k. \tag{6.29} \]

We will consider two special cases of the above result applicable to the
(3 + 1)-dimensional spacetime.

Spherically symmetric spacetime. In this case there are two-dimensional
surfaces $\varphi_2$ of constant positive curvature, concentric about
a fixed point O at all times. We may choose $x^1 = \theta$, $x^2 = \phi$ to denote the
coordinates on $\varphi_2$, and $x^3 = r$, $x^4 = t$, to denote the variation among
the $\{\varphi_2\}$ family. Then, from the above result (6.29), the line element has
the form

\[ ds^2 = A(r, t)\,dt^2 + 2H(r, t)\,dt\,dr + B(r, t)\,dr^2 + F(r, t)\{d\theta^2 + \sin^2\theta\,d\phi^2\}, \tag{6.30} \]

where A, H, B and F are general functions. We shall need this spacetime
in later work to describe the gravitational field of a spherically symmetric
distribution of matter and energy.

Cosmological spacetimes. In this situation, there is a family of
three-dimensional maximally symmetric spacelike subspaces $\{\varphi_3\}$. We choose
$x^0 = t$ and use $x^1$, $x^2$, $x^3$ to denote points on any $\varphi_3$. On $\varphi_3$ we use the
metric (6.25)-(6.27) depending on the sign of $K$. All three cases can be
represented by a compact line element:

\[ ds^2 = dt^2 - S^2(t)\left[\frac{dr^2}{1 - kr^2} + r^2(d\theta^2 + \sin^2\theta\,d\phi^2)\right], \qquad k = 0, \pm 1. \tag{6.31} \]

Note that the function $g_{00}$ in Equation (6.31) can be made unity in this
case because it can be absorbed in a pure time transformation $t \to t'$.
Thus, if we start with $t'$, then we can choose $t$ such that

\[ g_{00}(t')\,dt'^2 = dt^2. \]

It is easy to verify that a solution to these equations is given by

\[ \xi^0 = 1, \qquad \xi^1 = -Hr. \]

The fact that $\xi^0 \neq 0$ suggests that a symmetry with translation in the time
direction is possible. The vector is timelike for $H^2r^2e^{2Ht} < 1$.


Exercises 

1. Show that the Gödel universe given by the line element

\[ ds^2 = (dx^0)^2 + 2e^{x^1}dx^0\,dx^2 - (dx^1)^2 + \tfrac12 e^{2x^1}(dx^2)^2 - (dx^3)^2 \]

has the following Killing vector fields:

\[ (1, 0, 0, 0), \quad (0, 0, 0, 1), \quad (0, 0, 1, 0), \quad (0, 1, -x^2, 0), \quad (-4e^{-x^1},\ 2x^2,\ -(x^2)^2 + 2e^{-2x^1},\ 0). \]

Is this spacetime (i) homogeneous, (ii) isotropic and (iii) stationary?

2. Show that the subspaces $t$ = constant of the Heckmann-Schücking spacetime

\[ ds^2 = dt^2 + 2e^{x^1}dt\,dx^2 - c_{11}(t)\{(dx^1)^2 + \alpha e^{2x^1}(dx^2)^2\} - 2c_{12}(t)\,e^{x^1}dx^1\,dx^2 - c_{33}(t)(dx^3)^2 \]

are its homogeneous subspaces.

3. Show that the integrability condition (6.9) can be written in the form

\[ \xi^m R_{ijkl;m} = -\xi_{m;l}\,R_{ijk}{}^m - \xi_{m;k}\,R_{ij}{}^m{}_l - \xi_{m;i}\,R^m{}_{jkl} + \xi_{m;j}\,R^m{}_{ikl}. \]

Deduce, by multiplication by $g^{ik}$ or otherwise, that

\[ \xi^m R_{jl;m} = \xi_{j;m}\,R^m{}_l + \xi_{l;m}\,R^m{}_j. \]

4. Find ten independent Killing vectors for the Minkowski spacetime. 

5. Show that any Killing vector $\xi^i$ satisfies the equation

\[ \Box\xi^i + R^i{}_k\,\xi^k = 0. \]

6. If $T^{ik}$ is the energy momentum tensor and $\xi_i$ is a timelike Killing vector,
show that the integral

\[ \int T^{ik}\,\xi_i\,d\Sigma_k \]

over the whole spacelike hypersurface is independent of the choice of the
hypersurface.

7. Show that the line element (6.31) for $k = +1$ is manifestly conformal to the
Minkowski line element through the following series of transformations (due to
L. Infeld and A. Schild):

\[ r = \sin R, \qquad T = \int \frac{du}{S(u)}, \qquad \xi = \tfrac12\,(T + R), \qquad \eta = \tfrac12\,(T - R); \]

\[ X = \tan\xi, \qquad Y = \tan\eta, \qquad \tau = \tfrac12\,(X + Y), \qquad \rho = \tfrac12\,(X - Y). \]

What transformations will do the same for the case $k = -1$?



Chapter 7 

Physics in curved spacetime 


7.1 Introduction 

Having acquainted ourselves with the trials and tribulations of working
in non-Euclidean spacetimes we are now prepared for the next step, that
of describing physics in such curved spacetimes. For we recall from
Chapter 2 that the Einstein programme for general relativity consists
of replacing the Newtonian perception of gravitation as a force by the
notion that its effect makes the geometry of spacetime ‘suitably
non-Euclidean’. What we mean by ‘suitably’ will be clear in the next two
chapters. But given that the geometry is non-Euclidean we first need to
know how the rest of physics is described in it.

For example, how do we describe the motion of a particle under 
a non-gravitational force? How do we write Maxwell’s equations? 
What is the role of energy-momentum tensors? Can a dynamical action 
principle be written in curved spacetime? Such questions need our 
attention before we turn to the basic issue of how gravity actually leads 
to curved spacetime. 

To this end we will introduce a concept that Einstein took as a basic 
principle in formulating general relativity. It is known as the principle 
of equivalence. 


7.2 The principle of equivalence 

Let us go back to the purely mathematical result embodied in the
relations shown in Section 4.6 and attempt to describe their physical
meaning. These relations tell us that special (locally inertial) coordinates
that behave like the coordinates $(t, x, y, z)$ of special relativity exist in
the neighbourhood of any point P in spacetime. Physically, these coordinates
imply a frame of reference in which a momentary illusion is created
at P and in a small neighbourhood of P that the geometry is of special
relativity. The illusion is momentary and local to P because we have seen
that the relations of (4.15) cannot be made to hold everywhere and at all
times.

In view of the assertion made in Section 2.1 that gravitation manifests
itself as non-Euclidean geometry, we would have to argue that in
the above locally inertial frame gravitation has been transformed away
momentarily and in a small neighbourhood of P. How does this happen
in practice? Consider Einstein’s celebrated example of the freely falling
lift. A person inside such a lift feels weightless. The accelerated frame of
reference of the lift provides the locally inertial frame in the small
neighbourhood of the falling person. Similarly, a spacecraft circling around
Earth is in fact freely falling in the Earth’s gravity, and the astronauts
inside it feel weightless. (See Figure 7.1 showing an astronaut floating
in space.)



Fig. 7.1. A floating astronaut in the micro-gravity environment of a space
shuttle. Photograph by courtesy of NASA.

It should be emphasized that this feeling of weightlessness in a
falling lift or a spacecraft is limited to local regions: there is no universal
frame that transforms away Earth’s gravity everywhere, at all times. If
we demand that the relations of (4.15) hold at all points of spacetime,
we would need to have $\partial\Gamma^i_{kl}/\partial x^m = 0$ everywhere, leading to $R^i{}_{klm} = 0$,
that is, to a flat spacetime. Thus a curved spacetime with a non-vanishing
Riemann tensor is necessary to describe genuine effects of gravitation.

The weak principle of equivalence states that effects of gravitation
can be transformed away locally and over small intervals of time by
using suitably accelerated frames of reference. Thus it is the physical
statement of the mathematical relations given by (4.15). It is possible,
however, to go from here to a much stronger statement, the so-called
strong principle of equivalence, which states that any physical
interaction (other than gravitation, which has now been identified with
geometry) behaves in a locally inertial frame as if gravitation were
absent. For example, Maxwell’s equations will have their familiar form
in a locally inertial frame. Thus an observer performing a local experiment
in a freely falling lift would measure the speed of light to be c.

The strong principle of equivalence enables us to extend any physical
law that is expressed in the covariant language of special relativity to
the more general form it would have in the presence of gravitation. The
law is usually expressed in terms of vectors, tensors, or spinors in the
Minkowski spacetime of special relativity. All we have to do is to write
it in terms of the corresponding entities which are covariant in curved
spacetime. Thus, in the flat spacetime of special relativity, the Maxwell
electromagnetic field tensor $F^{ik}$ is related to the current vector $j^k$ by

\[ F^{ik}{}_{,i} = 4\pi j^k. \tag{7.1} \]

In curved spacetime the ordinary derivative is replaced by the covariant
derivative:

\[ F^{ik}{}_{;i} = 4\pi j^k. \tag{7.2} \]

Notice that the effect of gravitation enters through the $\Gamma^i_{kl}$ terms that
are present in (7.2). This generalization of (7.1) to (7.2) is called the
minimal coupling of the field with gravitation, since it is the simplest
one possible.

So, in order to describe how other interactions behave in the presence 
of gravitation, we use the covariance under the general coordinate
transformation as the criterion to be satisfied by their underlying equations.
Thus, it is immediately clear from the example of the electromagnetic 
field that a light ray describes a null geodesic. 

In the same vein we can now describe a moving object that is acted
on by no other interaction except gravitation, for example, a probe
moving in the gravitational field of the Earth. In the absence of gravity,
this object would move in a straight line with uniform velocity; that is,
with the equation of motion

$$\frac{du^i}{ds} = 0, \tag{7.3}$$

where $u^i$ is the 4-velocity. In the presence of gravity, (7.3) is modified
to our geodesic equation (5.19).


7.3 A uniformly accelerated frame 

We now describe another example that provides a clue about how gravitational
effects show up in spacetime geometry according to general
relativity. Consider the Minkowski spacetime with the standard line element

$$ds^2 = c^2\,dt^2 - dx^2 - dy^2 - dz^2. \tag{7.4}$$

If we make the coordinate transformation for a constant $g$,

$$x = \frac{c^2}{g}\left(\cosh\frac{gt'}{c} - 1\right) + x'\cosh\frac{gt'}{c}, \tag{7.5}$$

together with

$$y = y', \quad z = z', \quad t = \frac{c}{g}\sinh\frac{gt'}{c} + \frac{x'}{c}\sinh\frac{gt'}{c},$$

this leads to the line element

$$ds^2 = \left(1 + \frac{gx'}{c^2}\right)^2 c^2\,dt'^2 - dx'^2 - dy'^2 - dz'^2. \tag{7.6}$$

What interpretation can we give to (7.6)? The origin of the $(x', y', z')$
system has a world line whose parametric form in the old coordinates is
given by

$$x = \frac{c^2}{g}\left(\cosh\frac{gt'}{c} - 1\right), \quad y = 0, \quad z = 0, \quad t = \frac{c}{g}\sinh\frac{gt'}{c}. \tag{7.7}$$

Using the kinematics of special relativity described in Section 1.8
of Chapter 1, it can be easily seen that (7.7) describes the motion of
a point that has a uniform acceleration $g$ in the $x$ direction, a point
that is momentarily at rest at the origin of $(x, y, z)$ at $t = 0$. We
may interpret the line element (7.6) and the new coordinate system as
describing the spacetime in the rest frame of the uniformly accelerated
observer.
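
The statement that the transformation (7.5) turns the Minkowski line element (7.4) into (7.6) can be spot-checked numerically. The following is only an illustrative sketch, not part of the original derivation. It assumes the transformation in the form $t = (c/g + x'/c)\sinh(gt'/c)$, $x = (c^2/g + x')\cosh(gt'/c) - c^2/g$, and the function names `to_inertial` and `induced_metric` are hypothetical. The metric in the primed coordinates is obtained by central differences.

```python
import math


def to_inertial(t_p, x_p, g=1.0, c=1.0):
    """Map accelerated-frame coordinates (t', x') to inertial (t, x).

    Assumed form of the transformation (7.5); y and z are unchanged.
    """
    phase = g * t_p / c
    scale = c * c / g + x_p
    t = scale * math.sinh(phase) / c
    x = scale * math.cosh(phase) - c * c / g
    return t, x


def induced_metric(t_p, x_p, g=1.0, c=1.0, h=1e-5):
    """Coefficients of dt'^2, dt'dx' and dx'^2 in ds^2 = c^2 dt^2 - dx^2."""
    # Partial derivatives with respect to t' by central differences.
    t_a, x_a = to_inertial(t_p + h, x_p, g, c)
    t_b, x_b = to_inertial(t_p - h, x_p, g, c)
    dt_dtp, dx_dtp = (t_a - t_b) / (2 * h), (x_a - x_b) / (2 * h)
    # Partial derivatives with respect to x'.
    t_a, x_a = to_inertial(t_p, x_p + h, g, c)
    t_b, x_b = to_inertial(t_p, x_p - h, g, c)
    dt_dxp, dx_dxp = (t_a - t_b) / (2 * h), (x_a - x_b) / (2 * h)
    g_tt = c * c * dt_dtp ** 2 - dx_dtp ** 2
    g_tx = c * c * dt_dtp * dt_dxp - dx_dtp * dx_dxp
    g_xx = c * c * dt_dxp ** 2 - dx_dxp ** 2
    return g_tt, g_tx, g_xx
```

In units with $g = c = 1$, `induced_metric(0.7, 0.3)` should return values close to $(1 + 0.3)^2$, $0$ and $-1$, which are the coefficients appearing in (7.6).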

Direct calculation shows that not all $\Gamma^i_{kl}$ are zero in (7.6) at $x' = 0$,
$y' = 0$, $z' = 0$. The frame is therefore non-inertial. For the neighbourhood
of the origin, the metric component

$$g_{00} \cong 1 + \frac{2gx'}{c^2} = 1 + \frac{2\phi}{c^2}, \tag{7.8}$$

where $\phi$ is the Newtonian gravitational potential for a uniform gravitational
field that induces an acceleration due to gravity of $-g$. We have
here the reverse situation to that of the falling lift: we seem to have generated
a pseudo-gravitational field by choosing a suitably accelerated
observer. The prefix ‘pseudo-’ is used because the gravitational field is
not real; it is an illusory effect arising from the choice of coordinates.
The Riemann tensor for the metric is zero, thus confirming the above
statement.

An example of an accelerated frame is provided by a bus or an 
aircraft starting off from rest. All passengers facing in the forward 
direction feel a force pressing their backs to their seats. This force 
‘attracting’ them to the seats is illusory and momentary, lasting only so 
long as the acceleration persists. Astronauts taking off in rocket-driven 
spaceships feel their weight increase several times at the time of lift off, 
again because of the initial acceleration. All these examples tell us how 
intimately related the accelerated frames are to gravity. 

Nevertheless the relation (7.8) is also suggestive of the real gravitational
field, as we shall see in the following example and later in
Chapter 8.


Example 7.3.1 Consider a particle held at rest at the origin $x = 0$, $y = 0$,
$z = 0$ in the manifestly Minkowski frame (7.4). What is its trajectory in
the uniformly accelerated frame (7.6)?

On setting $x = 0$ in Equation (7.5), we get

$$x' = -\frac{c^2}{g}\left(1 - \operatorname{sech}\frac{gt'}{c}\right),$$

which, for small $t'$, i.e., for $t' \ll c/g$, approximates to

$$x' \cong -\frac{1}{2}gt'^2.$$

Thus to an observer at rest in the accelerated frame, the particle will
appear to have a ‘free fall’ in the negative $x'$ direction, and the observer will
ascribe this to gravity in that direction.

Example 7.3.2 Problem. A particle of unit rest mass is uniformly accelerated
as described in Section 7.3. Show that, at time $t$, its energy has grown
to $\gamma(t)c^2$, where

$$\gamma(t) = \left(1 + \frac{g^2t^2}{c^2}\right)^{1/2}.$$

Solution. Without loss of generality we take the particle at $x' = 0$. Using
(7.7) we get $c\,dt/ds = \cosh(gt'/c)$. We may identify $c\,dt/ds$ with the energy
of motion per unit rest energy. That is, $\gamma(t) = \cosh(gt'/c)$. Using (7.7) again to
relate $t'$ to $t$, we get

$$\sinh\left(\frac{gt'}{c}\right) = \frac{gt}{c}.$$

From this we get the required result

$$\gamma(t) = \left(1 + \frac{g^2t^2}{c^2}\right)^{1/2}.$$
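
The two examples above lend themselves to a quick numerical check. The sketch below is an illustration under stated assumptions, not part of the original text: it takes the trajectory of Example 7.3.1 as $x' = (c^2/g)[\operatorname{sech}(gt'/c) - 1]$ and the world line of Example 7.3.2 as $t = (c/g)\sinh(gt'/c)$, and all function names are hypothetical.

```python
import math


def x_prime_exact(t_p, g=1.0, c=1.0):
    """x' of the particle held at x = 0, as seen in the accelerated frame."""
    return (c * c / g) * (1.0 / math.cosh(g * t_p / c) - 1.0)


def x_prime_free_fall(t_p, g=1.0):
    """Newtonian free-fall approximation, valid for t' << c/g."""
    return -0.5 * g * t_p ** 2


def gamma_from_proper_time(t_p, g=1.0, c=1.0):
    """Energy per unit rest energy of the accelerated particle, cosh(g t'/c)."""
    return math.cosh(g * t_p / c)


def gamma_from_inertial_time(t, g=1.0, c=1.0):
    """The same quantity written in terms of the inertial time t."""
    return math.sqrt(1.0 + (g * t / c) ** 2)
```

For $gt'/c = 0.01$ the exact and free-fall trajectories differ only by a relative amount of order $(gt'/c)^2$, while the two expressions for $\gamma$ agree to rounding error once $t$ and $t'$ are related by $gt/c = \sinh(gt'/c)$.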


7.4 The action principle and the 
energy-momentum tensors 

Let us now see how we can write the laws of physics in the covariant
language in a Riemannian spacetime using the strong principle of
equivalence. We take the familiar example of charged particles interacting
with the electromagnetic field. The physical laws can be derived from
an action principle. First we write the action in Minkowski spacetime:

$$\mathcal{A} = -\sum_a c\,m_a\int ds_a - \frac{1}{16\pi c}\int_V F_{ik}F^{ik}\,d^4x - \sum_a\frac{e_a}{c}\int A_i\,dx_a^i. \tag{7.9}$$

Here we assume that the action describes physics in a volume $V$ of
spacetime bounded by the surface $\Sigma$. All variations of physical quantities are
supposed to vanish on $\Sigma$. $A_i$ are the components of the 4-potential,
which are related to the field tensor $F_{ik}$ by

$$A_{k,i} - A_{i,k} = F_{ik}, \tag{7.10}$$

while $e_a$ and $m_a$ are the charge and rest mass of particle $a$, whose
coordinates are given by $x_a^i$ and the proper time by $s_a$ with

$$ds_a^2 = \eta_{ik}\,dx_a^i\,dx_a^k. \tag{7.11}$$

How do we generalize (7.9) to Riemannian spacetime? First, we note
that $\eta_{ik}$ in (7.11) are replaced by $g_{ik}$. Next, starting from the covariant
vector $A_i$, we generate $F_{ik}$ by the covariant generalization of (7.10):

$$A_{k;i} - A_{i;k} = F_{ik}. \tag{7.12}$$

However, since the expression (7.12) is antisymmetric in $(i, k)$, the extra
terms involving the Christoffel symbols cancel out and we are back to
(7.10)! The volume integral in (7.9) is modified to

$$\int F_{ik}F^{ik}\sqrt{-g}\,d^4x. \tag{7.13}$$


The extra factor $\sqrt{-g}$ has crept in because the combination

$$\sqrt{-g}\,dx^1\,dx^2\,dx^3\,dx^0 = -\epsilon_{ijkl}\,dx^i\,dx^j\,dx^k\,dx^l \tag{7.14}$$

acts as a scalar. (Refer to Example 3.3.4.) We therefore have the following
generalized form of (7.9):

$$\mathcal{A} = -\sum_a c\,m_a\int ds_a - \frac{1}{16\pi c}\int_V F_{ik}F^{ik}\sqrt{-g}\,d^4x - \sum_a\frac{e_a}{c}\int A_i\,dx_a^i. \tag{7.15}$$

The variation of the world line of particle $a$ gives its equation of motion

$$\frac{d^2x_a^i}{ds_a^2} + \Gamma^i_{kl}\frac{dx_a^k}{ds_a}\frac{dx_a^l}{ds_a} = \frac{e_a}{m_ac^2}F^i{}_l\frac{dx_a^l}{ds_a}, \tag{7.16}$$

while the variation of $A_i$ gives the field equations (7.2).

We summarize the situation by stating a general rule. Whatever 
variables we introduce to specify the dynamics of the observed situation, 
we apply the principle of stationarity of action for small variations of 
these variables so that we end up knowing the ‘equations of motion’ that 
specify how these variables change over space and time. 


7.5 Variation of the metric tensor 

The transition from (7.9) to (7.15) has, however, introduced an additional
independent feature into the action, besides the particle world lines and
the potential vector. The new feature is the spacetime geometry typified
by the metric tensor $g_{ik}$. We argue, quite plausibly, that the entire problem
is specified not just by the dynamical and field variables, but also by the
spacetime geometry. What will happen if we demand that the $g_{ik}$ are
also dynamical variables and that the action $\mathcal{A}$ remains stationary for
small variations of the type

$$g_{ik} \to g_{ik} + \delta g_{ik}? \tag{7.17}$$

From the generalized action principle, should we not expect to get the
equations that determine the $g_{ik}$, and through them the spacetime geometry?
Let us investigate.

A glance at the action (7.15) shows that the last term does not
contribute anything under (7.17) if we keep the world lines and $A_i$ fixed
in spacetime. The first two terms, however, do make contributions. Let
us consider them in that order. First note that

$$\delta(ds_a^2) = \delta g_{ik}\,dx_a^i\,dx_a^k,$$

that is,

$$\delta(ds_a) = \frac{1}{2}\,\delta g_{ik}\,\frac{dx_a^i}{ds_a}\frac{dx_a^k}{ds_a}\,ds_a.$$

Therefore,

$$\delta\sum_a c\,m_a\int ds_a = \frac{1}{2}\sum_a c\int m_a\,\frac{dx_a^i}{ds_a}\frac{dx_a^k}{ds_a}\,ds_a\,\delta g_{ik}. \tag{7.18}$$

Let us consider this variation in a small 4-volume V near a point P.
If we look at a locally inertial coordinate system near P we can identify
the above expression in a more familiar form. Let us first identify

$$p^i_{(a)} = c\,m_a\frac{dx_a^i}{ds_a}$$

as the 4-momentum of particle $a$. Then $c\,p^0_{(a)} = E_a$ is the energy of the
particle, and we get

$$\frac{1}{2}\,c\,m_a\frac{dx_a^i}{ds_a}\frac{dx_a^k}{ds_a}\,ds_a = \frac{c^2}{2E_a}\,p^i_{(a)}p^k_{(a)}\,dt = \frac{c}{2E_a}\,p^i_{(a)}p^k_{(a)}\,dx^0.$$

Figure 7.2 shows the volume V as a shaded region in the neighbourhood
of P, $t$ being the local time coordinate and $x^\mu$ ($\mu = 1, 2, 3$) the
local rectangular space coordinates. We will shortly discuss the various
cases described in Figure 7.2. The expression (7.18) can then be looked
upon as a volume integral over V of the form

$$\frac{1}{2c}\int_V T^{ik}_{(m)}\,\delta g_{ik}\,d^4x, \tag{7.19}$$

where $T^{ik}_{(m)}$ is the sum of the expressions like

$$\frac{c^2}{E_a}\,p^i_{(a)}p^k_{(a)}$$

for each particle $a$ that crosses a unit volume of the shaded region near
P. We now interpret this sum under various conditions. In each case
the trick is to look at the problem in the locally inertial frame and then
transform to a general frame.


7.5.1 The energy tensor of matter 

This expression for $T^{ik}_{(m)}$ is none other than the usual expression for the
energy tensor of matter (also called the energy-momentum tensor or the
stress energy tensor). Since we will need this tensor frequently, it is
derived below for three different types of matter.

Fig. 7.2. In (a) we have matter particles moving along parallel worldlines, with no collisions and very little relative motion. In (b) we see particles moving relativistically in random directions, while in (c) we have an intermediate situation wherein particles have small, random, relative motions.


Dust 

This is the simplest situation, in which all of the particle world lines
going through the shaded region in Figure 7.2(a) are more or less parallel,
indicating that the particles of matter are moving without any relative
motion in the neighbourhood of P. Writing the typical 4-velocity as $u^i$
and using a Lorentz transformation to make $u^i = (1, 0, 0, 0)$ (that is,
transforming to the rest frame of the dust), the only non-zero component
of the energy tensor is

$$T^{00} = \sum_a m_ac^2 = \rho_0c^2,$$

where the summation is over a unit volume in the neighbourhood of P.
Here $\rho_0$ is the rest mass density of dust. In any other Lorentz frame we
get

$$T^{ik}_{(m)} = \rho_0c^2u^iu^k, \tag{7.20}$$

an expression that is easily generalized to any (non-Lorentzian) reference
system.

Relativistic particles 

This situation, described in Figure 7.2(b), represents the opposite
extreme. Here we have highly relativistic particles moving at random
through V. The 4-momentum of a typical particle is then approximated
to the form

$$E^2 = c^2|\mathbf{P}|^2 + m^2c^4 \cong c^2P^2, \quad P = |\mathbf{P}|.$$

Here $m$ is the rest mass of a typical particle. In the highly relativistic
approximation we have $|\mathbf{P}| \gg mc$.

Using the fact that the particles are moving randomly, we find that
the energy tensor has pressure components also:

$$T^{00} = \sum_a E_a = \varepsilon, \quad T^{11} = T^{22} = T^{33} = \frac{1}{3}\sum_a\frac{c^2P_a^2}{E_a} = \frac{1}{3}\varepsilon. \tag{7.21}$$

The factor 1/3 comes from randomizing in all directions. These are the
only non-zero pressure components. Here $\varepsilon$ is the energy density. Thus
for highly relativistic particles we get

$$T^{ik}_{(m)} = \mathrm{diag}(\varepsilon, \varepsilon/3, \varepsilon/3, \varepsilon/3). \tag{7.22}$$

This form is applicable to randomly moving neutrinos or photons.
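
The factor 1/3 quoted above is the average of the squared direction cosine along any one axis for an isotropic distribution of directions. A minimal sketch of that average is given below; it is an illustration only, the function name is hypothetical, and the integral over the unit sphere is done with a simple midpoint rule rather than any method taken from the text.

```python
import math


def mean_square_direction_cosine(n_theta=2000):
    """Average of cos^2(theta) over the unit sphere, by the midpoint rule."""
    d_theta = math.pi / n_theta
    total = 0.0
    for j in range(n_theta):
        theta = (j + 0.5) * d_theta
        # Solid-angle weight is sin(theta) d(theta) d(phi).
        total += math.cos(theta) ** 2 * math.sin(theta) * d_theta
    # The phi integral contributes 2*pi; divide by the full solid angle 4*pi.
    return total * 2.0 * math.pi / (4.0 * math.pi)
```

Because the three axes are equivalent and the three squared direction cosines add up to one, each average must equal 1/3, which is what the quadrature reproduces to within its discretization error.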

Fluid 

This situation is illustrated in Figure 7.2(c) and consists of a collection
of particles with small (non-relativistic) random motions. If we choose
the locally inertial frame in which the fluid as a whole is at rest as the
frame of reference, we can evaluate the components of $T^{ik}_{(m)}$ as follows.
Let a typical particle have the 4-momentum vector given by

$$p^0 = \frac{mc}{\sqrt{1 - v^2/c^2}}, \quad p^\mu = \frac{mv^\mu}{\sqrt{1 - v^2/c^2}} \quad (\mu = 1, 2, 3). \tag{7.23}$$

Then

$$T^{00} = \sum mc^2\left(1 - \frac{v^2}{c^2}\right)^{-1/2} = \rho c^2, \quad T^{11} = T^{22} = T^{33} = \frac{1}{3}\sum mv^2\left(1 - \frac{v^2}{c^2}\right)^{-1/2} = p. \tag{7.24}$$

Here $\rho$ and $p$ are the density and pressure of the fluid. In a frame of
reference in which the fluid as a whole has a 4-velocity $u^i$, the energy
tensor becomes

$$T^{ik}_{(m)} = (p + \rho c^2)u^iu^k - p\,\eta^{ik}. \tag{7.25}$$

The generally covariant form of (7.25) is obviously

$$T^{ik}_{(m)} = (p + \rho c^2)u^iu^k - p\,g^{ik}. \tag{7.26}$$

Note that $\rho$ is not just the rest-mass density, but also includes the energy
density of internal motion, as seen in (7.24).
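
The fluid tensor of (7.25) is easy to tabulate in a Lorentz frame. The sketch below is illustrative only: it assumes the metric signature $(+, -, -, -)$ used in (7.4), takes $u^i = \gamma(1, \mathbf{v}/c)$, and the names `four_velocity`, `fluid_tensor` and `trace` are hypothetical.

```python
import math

# Minkowski metric with signature (+, -, -, -); eta^{ik} equals eta_{ik}.
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def four_velocity(vx, vy, vz, c=1.0):
    """Dimensionless 4-velocity u^i = dx^i/ds for the 3-velocity (vx, vy, vz)."""
    gamma = 1.0 / math.sqrt(1.0 - (vx * vx + vy * vy + vz * vz) / (c * c))
    return [gamma, gamma * vx / c, gamma * vy / c, gamma * vz / c]


def fluid_tensor(rho, p, u, c=1.0):
    """T^{ik} = (p + rho c^2) u^i u^k - p eta^{ik} in a Lorentz frame."""
    return [[(p + rho * c * c) * u[i] * u[k] - p * ETA[i][k]
             for k in range(4)] for i in range(4)]


def trace(T):
    """Invariant trace eta_{ik} T^{ik} for the diagonal Minkowski metric."""
    return sum(ETA[i][i] * T[i][i] for i in range(4))
```

In the rest frame the tensor reduces to $\mathrm{diag}(\rho c^2, p, p, p)$, and its trace $\rho c^2 - 3p$ is the same in every Lorentz frame, which gives a simple consistency check on a boosted result.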

We may now relax our restriction to the locally inertial coordinate
system at P. The generalized form of (7.19) is then

$$\delta\sum_a c\,m_a\int ds_a = \frac{1}{2c}\int_V T^{ik}_{(m)}\sqrt{-g}\,\delta g_{ik}\,d^4x. \tag{7.27}$$


7.5.2 The energy tensor of the electromagnetic field 

We next consider the variation of the second term of (7.9). If we keep
$A_i$ fixed, the $F_{ik}$, as given by (7.12) or (7.10), remain unchanged under
the variation of $g_{ik}$. Hence

$$\delta(F_{ik}F^{ik}\sqrt{-g}) = F_{ik}F_{lm}\,\delta(g^{il}g^{km}\sqrt{-g}).$$

From the basic definition we get

$$\delta g^{ik}\,g_{kl} = -g^{ik}\,\delta g_{kl},$$

that is,

$$\delta g^{ik} = -g^{im}g^{kn}\,\delta g_{mn}. \tag{7.28}$$

Also, from (4.12) we have

$$\delta\sqrt{-g} = \frac{1}{2}\sqrt{-g}\,g^{ik}\,\delta g_{ik}. \tag{7.29}$$

Substituting these expressions into the variation of the second term of
the action gives

$$\frac{1}{16\pi c}\,\delta\int_V F_{ik}F^{ik}\sqrt{-g}\,d^4x = \frac{1}{2c}\int_V T^{ik}_{(em)}\sqrt{-g}\,\delta g_{ik}\,d^4x, \tag{7.30}$$

with the electromagnetic energy tensor given by

$$T^{ik}_{(em)} = \frac{1}{4\pi}\left[\frac{1}{4}F_{mn}F^{mn}g^{ik} - F^i{}_lF^{kl}\right]. \tag{7.31}$$
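
As a check on the structure of the electromagnetic energy tensor, the sketch below evaluates it in Minkowski spacetime for given electric and magnetic fields. It is an illustration under explicit assumptions, not the author's calculation: Gaussian units, signature $(+, -, -, -)$, the component assignment $F_{0\mu} = E_\mu$, $F_{12} = -B_z$ (and cyclic), and the form $T^{ik} = (1/4\pi)[\frac{1}{4}F_{mn}F^{mn}\eta^{ik} - F^i{}_lF^{kl}]$. The function names are hypothetical.

```python
import math

# Minkowski metric with signature (+, -, -, -).
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def field_tensor_lower(E, B):
    """Antisymmetric F_{ik} with F_{0 mu} = E_mu and F_{12} = -B_z (cyclic)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0.0, Ex, Ey, Ez],
            [-Ex, 0.0, -Bz, By],
            [-Ey, Bz, 0.0, -Bx],
            [-Ez, -By, Bx, 0.0]]


def em_energy_tensor(E, B):
    """T^{ik} of the electromagnetic field in a Lorentz frame."""
    F_lo = field_tensor_lower(E, B)
    # Raise both indices with the diagonal metric.
    F_up = [[ETA[i][i] * ETA[k][k] * F_lo[i][k] for k in range(4)]
            for i in range(4)]
    invariant = sum(F_lo[m][n] * F_up[m][n]
                    for m in range(4) for n in range(4))
    T = [[0.0] * 4 for _ in range(4)]
    for i in range(4):
        for k in range(4):
            # F^i_l F^{kl} = sum over l of F^{il} eta_{ll} F^{kl}.
            mixed = sum(F_up[i][l] * ETA[l][l] * F_up[k][l] for l in range(4))
            T[i][k] = (0.25 * invariant * ETA[i][k] - mixed) / (4.0 * math.pi)
    return T
```

With these conventions $T^{00}$ comes out as the familiar energy density $(E^2 + B^2)/8\pi$, the components $T^{0\mu}$ are proportional to $\mathbf{E} \times \mathbf{B}$, and the trace $\eta_{ik}T^{ik}$ vanishes for any choice of fields.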

The above two examples can be generalized to any field $\Lambda$ that is
described by an action

$$\mathcal{A}_\Lambda = \int_V L_\Lambda\sqrt{-g}\,d^4x. \tag{7.32}$$

Here $L_\Lambda$ is the Lagrangian density of $\Lambda$. The variation of $\mathcal{A}_\Lambda$ may be
written as

$$\delta\mathcal{A}_\Lambda = -\frac{1}{2c}\int_V T^{ik}_{(\Lambda)}\sqrt{-g}\,\delta g_{ik}\,d^4x. \tag{7.33}$$

This may be taken as a formal definition of $T^{ik}$, the energy-momentum
tensor.

In theories defined only in Minkowski spacetime the appearance of
energy tensors is somewhat ad hoc. They do not enter explicitly into any
dynamic or field equations. They appear only through their divergences,
the typical rule for conservation of energy and momentum being given
by $T^{ik}{}_{,k} = 0$. In our curved spacetime framework the $T^{ik}$ find a natural
expression through the variation of $g_{ik}$. Moreover, as we shall show next,
the above derivation of the $T^{ik}$ leads to the zero-divergence equation as
an automatic consequence.

7.5.3 Conservation of energy and momentum 

We begin with the observation that $L_\Lambda$ in the action leading to (7.32) is
a scalar quantity, so any change of coordinates does not change it. Using
this result, we make an infinitesimal change of coordinates:

$$x'^i = x^i + \xi^i, \tag{7.34}$$

where the $\xi^i$ are infinitesimally small. Clearly, for such a coordinate
change, the change $\delta\mathcal{A}_\Lambda$ in the action will be zero. But we can express
the change in another way. The coordinate change introduces a change of
metric tensor implying that for the same geometry there will in general
be a non-zero $\delta g_{ik}$. So we will get the change of action as in (7.33). We
will therefore evaluate it first. For brevity we will denote $\partial\xi^i/\partial x^k$ by $\xi^i_k$.
Tensor transformation law will give

$$g'_{ik}(x'^l) = \frac{\partial x^m}{\partial x'^i}\frac{\partial x^n}{\partial x'^k}\,g_{mn}(x^l),$$

so we get

$$g'_{ik}(x^l + \xi^l) = (\delta^m_i - \xi^m_i)(\delta^n_k - \xi^n_k) \times g_{mn}(x^l).$$

Expand the left-hand side by Taylor expansion around $x^l$ retaining only
up to the first-order term in $\xi^l$. On the right-hand side likewise retain
terms of that order only to get

$$\delta g_{ik} = g'_{ik}(x^l) - g_{ik}(x^l) = -g_{im}\xi^m_k - g_{mk}\xi^m_i - g_{ik,l}\,\xi^l.$$

Now convert the ordinary derivatives of $\xi^i$ into covariant derivatives
by adding the terms with Christoffel symbols and use the identities of
Section 4.5 to express the derivative $g_{ik,l}$ in terms of the same symbols. A
simple manipulation along these lines leads to a result somewhat similar
to that of Chapter 6 (see Equation (6.6) there):

$$\delta g_{ik} = -[\xi_{i;k} + \xi_{k;i}]. \tag{7.35}$$

We therefore get the change in action, using Equation (7.33), as

$$\delta\mathcal{A}_\Lambda = -\frac{1}{2c}\int T^{ik}_{(\Lambda)}\sqrt{-g}\,\delta g_{ik}\,d^4x \tag{7.36}$$

$$= \frac{1}{2c}\int T^{ik}_{(\Lambda)}\sqrt{-g}\,[\xi_{i;k} + \xi_{k;i}]\,d^4x. \tag{7.37}$$

Since $T^{ik}$ is a symmetric tensor, this expression can be further simplified
and rewritten (after suppressing the suffix $\Lambda$ on $T^{ik}$) as

$$\delta\mathcal{A}_\Lambda = \frac{1}{c}\int\left[(T^{ik}\xi_i)_{;k} - T^{ik}{}_{;k}\,\xi_i\right]\sqrt{-g}\,d^4x. \tag{7.38}$$

Of the two terms inside the square brackets, the first gets transformed
to a surface integral by Green’s theorem and, since in the variational
process changes like $\xi_i$ are supposed to vanish on the boundary, we
are left with the second term only. Since we expect $\delta\mathcal{A}_\Lambda$ to vanish for
arbitrary $\xi_i$ we conclude that

$$T^{ik}{}_{;k} = 0, \tag{7.39}$$

i.e., the energy-momentum tensor is conserved.

Notice that this result was deduced from the scalar property of the
action, that is, from its invariance with respect to coordinate transformation.
This is a ‘symmetry’ property of the action and the above result
may be seen as an example of the general theorem due to Emmy Noether,
which states that for every continuous symmetry of the action there is a
conservation law. We encounter several examples of Noether’s theorem in
theoretical physics.


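Example 7.5.1 Problem. For a scalar field $\phi$ with Lagrangian density

$$L = \frac{1}{2}\phi_{,i}\phi_{,k}\,g^{ik}$$

derive the energy-momentum tensor.

Solution. By performing the variation of $g_{ik}$, $g^{ik}$, etc. we get

$$\delta\mathcal{A}_\phi = \delta\int\frac{1}{2}\phi_{,i}\phi_{,k}\,g^{ik}\sqrt{-g}\,d^4x = \frac{1}{2}\int\phi_{,i}\phi_{,k}\,\delta(g^{ik}\sqrt{-g})\,d^4x.$$

Using the result

$$\delta(g^{ik}\sqrt{-g}) = \delta g^{ik}\sqrt{-g} + g^{ik}\,\delta\sqrt{-g} = \delta g^{ik}\sqrt{-g} - \frac{1}{2}\sqrt{-g}\,g_{mn}\,\delta g^{mn}\,g^{ik}$$

we get

$$\delta\mathcal{A}_\phi = \frac{1}{2}\int\left[\phi_{,i}\phi_{,k} - \frac{1}{2}g_{ik}\,g^{pq}\phi_{,p}\phi_{,q}\right]\delta g^{ik}\sqrt{-g}\,d^4x = \frac{1}{2}\int T_{ik}\,\delta g^{ik}\sqrt{-g}\,d^4x,$$

where

$$T_{ik} = \phi_{,i}\phi_{,k} - \frac{1}{2}g_{ik}\,g^{pq}\phi_{,p}\phi_{,q}.$$

This is the required energy-momentum tensor for the scalar field.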

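Problem. Show that, if the Lagrangian density $L$ of a field explicitly depends
on $g_{ik}$ and $g_{ik,l}$, then the corresponding energy tensor is given by

$$T^{ik} = 2\left\{\left[\frac{\partial L}{\partial g_{ik,l}}\right]_{,l} + \frac{1}{2}g^{mn}g_{mn,l}\frac{\partial L}{\partial g_{ik,l}} - \frac{\partial L}{\partial g_{ik}} - \frac{1}{2}Lg^{ik}\right\}.$$

Solution. We have, from (7.33) and (7.29),

$$\delta(L\sqrt{-g}) = \left[\frac{\partial L}{\partial g_{ik}}\,\delta g_{ik} + \frac{\partial L}{\partial g_{ik,l}}\,\delta g_{ik,l}\right]\sqrt{-g} + \frac{1}{2}L\sqrt{-g}\,g^{ik}\,\delta g_{ik}.$$

However, $\delta g_{ik,l}$ may be written as $(\delta g_{ik})_{,l}$ and one can use the divergence
theorem to get

$$\int_V\frac{\partial L}{\partial g_{ik,l}}\,\delta g_{ik,l}\sqrt{-g}\,d^4x = -\int_V\left[\frac{\partial L}{\partial g_{ik,l}}\sqrt{-g}\right]_{,l}\delta g_{ik}\,d^4x.$$

Here we have used the fact that the variations $\delta g_{ik}$ vanish on the boundary of
V, so that the surface integral on the right-hand side vanishes. Hence from
(7.33) we get

$$-\frac{1}{2}T^{ik}\sqrt{-g} = -\left[\frac{\partial L}{\partial g_{ik,l}}\sqrt{-g}\right]_{,l} + \sqrt{-g}\,\frac{\partial L}{\partial g_{ik}} + \frac{1}{2}\sqrt{-g}\,Lg^{ik}.$$

The stated answer follows when we recall that the first term on the right-hand
side contains $\sqrt{-g}$, and it gives $(\sqrt{-g})_{,l} = \frac{1}{2}\sqrt{-g}\,g^{mn}g_{mn,l}$.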

It was this variation of the metric tensor that led Hilbert to derive the 
field equations of general relativity shortly after Einstein had proposed 
them from heuristic considerations. We now turn our attention to this 
topic in the following chapter. 


Exercises 

1. Calculate the energy-momentum tensor for the scalar field $\phi$ given by the
action integral

$$\int(\phi_{,i}\phi_{,k}\,g^{ik} + m^2\phi^2)\sqrt{-g}\,d^4x,$$

where $m$ = constant. ($m$ is usually identified with the mass of the field.)

2. Fluid with isotropic pressure $p$ and density $\rho$ fills a spherically symmetric
region with the line element

$$ds^2 = e^\nu\,dt^2 - e^\lambda\,dr^2 - e^\mu(d\theta^2 + \sin^2\theta\,d\phi^2),$$

where $\lambda$, $\mu$, and $\nu$ depend on $r$ and $t$ only. From the conservation law deduce the
relations

$$\frac{\partial}{\partial t}(\lambda + 2\mu) = -\frac{2}{p + \rho}\frac{\partial\rho}{\partial t}; \quad \frac{\partial\nu}{\partial r} = -\frac{2}{p + \rho}\frac{\partial p}{\partial r}.$$

(The velocity of light is unity.)

3. Dust of density $\rho(t)$ and radiation of density $u(t)$ fill the spacetime given by
the line element

$$ds^2 = dt^2 - S^2(t)\left[\frac{dr^2}{1 - kr^2} + r^2(d\theta^2 + \sin^2\theta\,d\phi^2)\right],$$

where $k$ = 1, 0 or −1. From the conservation law deduce that

$$\frac{1}{S^3}\frac{\partial}{\partial t}(\rho S^3) + \frac{1}{S^4}\frac{\partial}{\partial t}(uS^4) = 0.$$

(The velocity of light is unity.)


4. Verify by direct calculation that the divergence of the electromagnetic
energy-momentum tensor vanishes everywhere except at the location of the charged
particles. By a suitable limiting process deduce the equations of motion of the
electric charge by evaluating

$$T^{ik}_{(m);k} + T^{ik}_{(em);k}$$

at the particle.

5. The action $\mathcal{A}_\Lambda$ is conformally invariant, i.e., it does not change when the
spacetime metric $g_{ik}$ is changed to $\Omega^2g_{ik}$, where $\Omega$ is a well-behaved function of
$x^i$ and $0 < \Omega < \infty$. Show that the trace of $T^{ik}_{(\Lambda)}$ vanishes identically. (The trace
of a tensor $A^{ik}$ is $g_{ik}A^{ik}$.)

6. Show that for dust $T^{ik} = \rho u^iu^k$ conservation means that $u^i$ follows a
geodesic.

7. Suppose that in a specific coordinate system the metric $g_{ik}$ is independent of
$x^1$. Show that the conservation law $T^k{}_{i;k} = 0$ for the energy-momentum tensor
becomes expressible as

$$\frac{1}{\sqrt{-g}}\frac{\partial}{\partial x^k}\left(\sqrt{-g}\,T^k{}_1\right) = 0.$$

(Both $T^{ik}$ and $g_{ik}$ are assumed diagonal.)

8. Show by direct calculation that Maxwell’s equations are conformally invariant.
Work out how masses of electric charges must transform if the Maxwell-Lorentz
equation of motion also is conformally invariant.

9. Calculate the form of the energy-momentum tensor for a plane electromagnetic
wave in Minkowski spacetime.

10. Write down Poynting’s theorem in the older three-dimensional form of the 
electromagnetic theory. Work out its form in the four-dimensional notation of 
special relativity and generalize it to curved spacetime. 



Chapter 8 

Einstein's equations 


8.1 A heuristic approach 

The preceding chapter showed that the variation of the action $\mathcal{A}$ with
respect to $g_{ik}$ leads us to the energy tensor of various interactions. We still
do not have dynamical equations that tell us how to determine the $g_{ik}$ in
terms of the distribution of matter and energy. It was Einstein’s conjecture
that the energy tensors should act as the ‘sources’ of gravity. Thus what
we have so far achieved is identification of sources of gravitation. But
we further need the basic variables whose sources are these $T_{ik}$. Einstein
felt that the variables are not to be found in physics but in the geometry
of spacetime. We have already seen that the basic measurements of the
geometry are carried out through the $g_{ik}$, the concept of parallelism is
expressed through the $\Gamma^i_{kl}$ while spacetime curvature appears through
the Riemann tensor $R_{iklm}$. Einstein reasoned in a heuristic way to arrive
at equations linking these quantities to the energy-momentum tensors.
Below we capture the reasoning he used.

Following the general trend of nineteenth-century physics, especially 
the Maxwell equations, Einstein looked for an expression that would act 
like a wave equation for $g_{ik}$, with $T_{ik}$ as the source. It is immediately
clear that the standard wave equation in the covariant form

$$g^{mn}g_{ik;mn} = \kappa T_{ik}, \tag{8.1}$$

where $\kappa$ is a constant, will not do, for the left-hand side vanishes identically.
In fact any covariant linear combination of the first and second
derivatives of the metric tensor will be expressed in terms of their covariant
derivatives and will vanish because of the identity $g_{ik;l} = 0$. However,
if we go to covariant non-linear expressions involving ordinary derivatives,
this need not be so.

Is there a second-rank tensor symmetric in its indices (like the $T_{ik}$)
that involves second derivatives of $g_{ik}$ in a non-linear form? Does such
a tensor appear naturally when one studies the geometry of spacetime?
Clearly, if the tensor is to bring out the special feature of curvature of
spacetime, it must be related to the Riemann tensor. Einstein first tried
$R_{ik}$, writing his equations as

$$R_{ik} = \text{constant} \times T_{ik}. \tag{8.2}$$

In order to ensure that energy and momentum are conserved, he had
to impose the additional requirement that the right-hand side of these
equations have a zero divergence. However, after some trial and error
he improved on this conjecture, finally arriving at the tensor $G_{ik}$ of
(5.11). His field equations of general relativity, published in 1915 (see
Reference [8]), took the form

$$R_{ik} - \frac{1}{2}g_{ik}R \equiv G_{ik} = -\kappa T_{ik}. \tag{8.3}$$

The constant $\kappa$ is to be determined by the requirement that the above
equations resemble Newton’s when describing slow motion ($v \ll c$) in
a weak gravitational field. We will return to this problem in Section 8.3.

These equations have the added advantage that in view of the Bianchi 
identities in (5.13) all solutions of these equations must satisfy the 
condition 


$$T^{ik}{}_{;k} = 0. \tag{8.4}$$

That is, the law of conservation of energy and momentum follows naturally
from (8.3).

Although there are ten Einstein equations for ten unknown $g_{ik}$,
the divergence condition of (8.4) reduces the number of independent
equations to six. This underdeterminacy of the problem can be related
to the general covariance of the theory: if $g_{ik}$ is a solution, then so is
any tensor transform of $g_{ik}$ obtained through a change of coordinates.
In short, there is a degeneracy of solutions: several apparently different 
solutions represent the same physical reality. One solution in this set can 
be obtained from another by a suitable coordinate transformation. 

The expression (8.4) follows for any $T^{ik}$ obtained from an action
principle by the variation of $g_{ik}$ as found in the last chapter. As mentioned
there, this result is an example of Noether’s theorem, which relates
a conservation law to a basic symmetry. In this particular case the symmetry
is that of coordinate invariance. It is therefore pertinent to ask
whether the Einstein tensor can also be obtained naturally by deriving
Equations (8.3) from an action principle. This problem was solved by
Hilbert [9] soon after Einstein proposed his equations of gravitation.

8.2 The Hilbert action principle 

If we wish to derive the Einstein tensor from an action principle, we
naturally look for a scalar of geometrical origin that contains up to first
derivatives of $g_{ik}$. No such scalar exists! However, if we go to second
derivatives, then the simplest scalar is $R$. It was therefore taken as the
starting point by Hilbert for his action principle.

Hilbert’s problem can be posed as follows. Consider the variation of
the term defined over a spacetime volume V,

$$\int_V R\sqrt{-g}\,d^4x \tag{8.5}$$

for $g^{ik} \to g^{ik} + \delta g^{ik}$ with the restriction that $\delta g^{ik}$ and $\delta g^{ik}{}_{,l}$ vanish on
the boundary of V. We now show that

$$\delta\int_V R\sqrt{-g}\,d^4x = \int_V\delta g^{ik}\left(R_{ik} - \frac{1}{2}g_{ik}R\right)\sqrt{-g}\,d^4x = -\int_V\delta g_{ik}\left(R^{ik} - \frac{1}{2}g^{ik}R\right)\sqrt{-g}\,d^4x. \tag{8.6}$$

To show this, first note that, under the variation $g_{ik} \to g_{ik} + \delta g_{ik}$,
the variation $\delta\Gamma^i_{kl}$ transforms as a tensor. This follows on applying the
transformation formula (4.5) to both $\Gamma^i_{kl}$ and $\Gamma^i_{kl} + \delta\Gamma^i_{kl}$ and taking
the difference. The second ‘gamma-independent’ terms on the right-hand
side of both the formulae are cancelled out, leaving a pure tensor
transformation law.

Coming now to the main result, we write

$$R = R_{ik}g^{ik}$$

so that

$$\delta R = \delta R_{ik}\,g^{ik} + R_{ik}\,\delta g^{ik}.$$

Thus we can deduce (8.6) provided we can show that
$\int_V\delta R_{ik}\,g^{ik}\sqrt{-g}\,d^4x = 0$. To prove this result, use a locally inertial
coordinate system to deduce that Equation (5.9) leads to

$$\sqrt{-g}\,g^{ik}\,\delta R_{ik} = -\sqrt{-g}\left[(g^{ik}\,\delta\Gamma^l_{ik})_{,l} - (g^{il}\,\delta\Gamma^k_{ik})_{,l}\right] = \sqrt{-g}\,w^l{}_{,l},$$

where we write

$$w^l = g^{il}\,\delta\Gamma^k_{ik} - g^{ik}\,\delta\Gamma^l_{ik}.$$

Here $w^l$ is seen to be a vector since it involves terms like $\delta\Gamma^i_{kl}$ that
are tensors. Then use Green’s theorem over the specified volume to
show that, since the variations of gammas are supposed to vanish on
the boundary, the variation part $\delta R_{ik}$ in the above expression leads to
zero. Note that, because locally the $\Gamma$ symbols are zero, we can write
$w^l{}_{;l} = w^l{}_{,l}$.

Caution. There has been a subtle departure from the usual variational 
procedure here! Normally the Lagrangian in the action is limited to first 
derivatives of any dynamical variable, so the variations of that variable 
(and not its derivative) are assumed to vanish on the boundary. Here the 
Lagrangian contains second derivatives of the metric tensor, so we need
both $\delta g_{ik,l}$ and $\delta g_{ik}$ to vanish on the boundary. In short, we are dealing
here with a variational problem involving derivatives one level higher 
than in the standard Euler-Lagrange variational problem. 

One way of avoiding having second derivatives in the action is to 
replace R by 

$$g^{ik}\left(\Gamma^l_{ik}\Gamma^m_{lm} - \Gamma^m_{il}\Gamma^l_{km}\right)$$


in the integral (8.5). One can show that the modified action also leads 
to the same Einstein equations. However, the modified integrand suffers 
from one defect: it is not a scalar! 

Ignoring these hiccups, it follows that Einstein’s equations can be
derived from an action principle if we add to $\mathcal{A}$ the term

$$\mathcal{A}_{\text{gravitation}} = \frac{1}{2\kappa c}\int_V R\sqrt{-g}\,d^4x. \tag{8.7}$$

Further, if to the scalar $R$ we add a constant ($-2\lambda$, say) that is trivially
a scalar, we get a modified set of field equations:

$$R_{ik} - \frac{1}{2}g_{ik}R + \lambda g_{ik} = -\kappa T_{ik}. \tag{8.8}$$

We may consider this equation as representing the variation of action
(8.7) in a spacetime region of prescribed volume, with $\lambda$ playing
the role of a Lagrange undetermined multiplier. We will consider
these equations only when we discuss cosmology, since the extra term
(the $\lambda$-term) has cosmological significance. $\lambda$ is often referred to as the
‘cosmological constant’. For the time being we move on to (8.3) and
relate $\kappa$ to known physical constants.


8.3 The Newtonian approximation 

The important question of the magnitude of $\kappa$ can be settled by
examining the relationship between general relativity and Newtonian
gravitation. The first hint of a connection between Newtonian gravitation
and the present theory was provided by (7.8), where we saw
that, provided $g_{00}$ did not differ significantly from unity, the difference
$(g_{00} - 1)$ is proportional to the Newtonian gravitational potential. We
now seek to formalize this relationship and thereby determine $\kappa$. We
will show that in the so-called slow-motion + weak-field approximation,
general relativity reduces to Newtonian gravitation.

This approximation is quantified by the following assumptions. 

1. The motions of particles are non-relativistic: $v \ll c$. In this case we are back
to Newtonian mechanics.

2. The gravitational fields are weak in the sense that

$$g_{ik} = \eta_{ik} + h_{ik}, \quad |h_{ik}| \ll 1. \tag{8.9}$$

The inequality suggests that we ignore powers of $|h_{ik}|$ higher than 2 in the
action principle and higher than 1 in the field equations. We expect this to
lead to a spacetime geometry not very different from Euclid's.

3. The fields change slowly with time. This means we ignore time derivatives 
in comparison with space derivatives. This assumption asks us to ignore 
possible effects of gravitational waves. In a sense, this approximation brings 
us back to the Newtonian concept of instantaneous action at a distance. 
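
To see how well these assumptions hold in practice, one can estimate the two small parameters for the Earth's orbit around the Sun. The sketch below is an illustration only. It anticipates the identification $h_{00} \approx 2\phi/c^2$ suggested by (7.8), uses rounded reference values for the constants, and the function names are hypothetical.

```python
# Rounded reference values in SI units (assumed for illustration).
G_NEWTON = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C_LIGHT = 2.998e8         # speed of light, m s^-1
SOLAR_MASS = 1.989e30     # kg
EARTH_ORBIT = 1.496e11    # mean Earth-Sun distance, m
EARTH_SPEED = 2.978e4     # mean orbital speed of the Earth, m s^-1


def h00_magnitude(mass_kg, radius_m):
    """|h_00| ~ 2 G M / (r c^2) at distance r from a spherical mass M."""
    return 2.0 * G_NEWTON * mass_kg / (radius_m * C_LIGHT ** 2)


def speed_ratio(speed_m_per_s):
    """The slow-motion parameter v/c."""
    return speed_m_per_s / C_LIGHT


weak_field = h00_magnitude(SOLAR_MASS, EARTH_ORBIT)
slow_motion = speed_ratio(EARTH_SPEED)
print(f"|h_00| ~ {weak_field:.2e}, v/c ~ {slow_motion:.2e}")
```

Both parameters are very small, and for a circular orbit $(v/c)^2$ is about half of $|h_{00}|$, so the slow-motion and weak-field conditions are of comparable strength in this setting.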

Let us now see how the action is simplified under these approximations.
First note that, with $x^0 = ct$,

$$ds^2 = (\eta_{ik} + h_{ik})\,dx^i\,dx^k \approx (1 + h_{00})c^2\,dt^2 - v^2\,dt^2,$$

that is,

$$ds \approx \left(\sqrt{1 + h_{00} - \frac{v^2}{c^2}}\right)c\,dt \approx \left(1 + \frac{1}{2}h_{00} - \frac{v^2}{2c^2}\right)c\,dt. \tag{8.10}$$

We next look at the term involving the scalar curvature. The linearized
expression for the Riemann tensor (see (5.5)) is

$$R_{iklm} \approx \frac{1}{2}\left(h_{kl,im} + h_{im,kl} - h_{km,il} - h_{il,km}\right). \tag{8.11}$$

The corresponding values of $R_{ik}$ and $R$ can also be calculated. However,
care is needed if we are to look at the action principle rather than the field
equations in this approximation, for we expect quadratic expressions in
the $h_{ik}$ to appear in the geometrical term (8.7).

Item 3 above eliminates time derivatives altogether. Further, the
ratios of typical space and time displacements are $\delta x^\mu/\delta x^0 = v^\mu/c$,
where $v^\mu$ are typical Newtonian velocities. Thus $h_{00}$ is more important
than any other $h_{ik}$, at least by the factor $(c/v)$. We will henceforth ignore
all other $h_{ik}$ in comparison with $h_{00}$. We then get

$$ g^{00} \approx 1 - h_{00}, \tag{8.12} $$

$$ \sqrt{-g} \approx 1 + \frac{1}{2}h_{00} \tag{8.13} $$

and

$$ R\sqrt{-g} \approx -\left(1 - \frac{1}{2}h_{00}\right)\nabla^2 h_{00}. \tag{8.14} $$


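Using these relations, we finally get the approximate action as

$$ \mathcal{A} \approx -\frac{1}{2\kappa}\iint\left(1 - \frac{1}{2}h_{00}\right)\nabla^2 h_{00}\,d^3x\,dt - \frac{1}{2}\sum mc^2\int h_{00}\,dt + \frac{1}{2}\sum m\int v^2\,dt + \text{constant}. \tag{8.15} $$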


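The constant represents path-independent terms that can be ignored in
a variational problem. Here we have switched over to Newtonian three-dimensional
notation, dropping particle labels a, b, ... and using the
3-vector $\mathbf{x}$ to denote $x^\mu$ ($\mu = 1, 2, 3$). We can use Green's theorem and
ignore surface terms. Thus, in the three-dimensional spatial volume,
we get

$$ \int_{\text{3-volume}}\left(1 - \frac{1}{2}h_{00}\right)\nabla^2 h_{00}\,d^3x = \oint_{\text{2-surface}}\left(1 - \frac{1}{2}h_{00}\right)\nabla h_{00}\cdot d\mathbf{S} + \frac{1}{2}\int_{\text{3-volume}}(\nabla h_{00})^2\,d^3x. $$

Since we are dynamically interested only in the 3-volume term, we
ignore the surface term. Hence

$$ \mathcal{A} \approx -\frac{1}{4\kappa}\iint(\nabla h_{00})^2\,d^3x\,dt - \frac{1}{2}\sum mc^2\int h_{00}\,dt + \frac{1}{2}\sum m\int v^2\,dt. \tag{8.16} $$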

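Now compare this with the Newtonian action

$$ \mathcal{A}_{\text{Newtonian}} = -\frac{1}{8\pi G}\iint(\nabla\phi)^2\,d^3x\,dt - \sum m\int\phi\,dt + \frac{1}{2}\sum m\int v^2\,dt, \tag{8.17} $$

with $\phi$ as the gravitational potential. Clearly, (8.16) becomes the same
as (8.17) if we put

$$ \phi = \frac{1}{2}c^2 h_{00}, \qquad \kappa = \frac{8\pi G}{c^4}. \tag{8.18} $$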
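
As a rough numerical illustration of assumption 2 and of (8.18), the sketch below evaluates $h_{00} = 2\phi/c^2$ for the point-mass potential $\phi = -GM/r$ at the surfaces of the Earth and the Sun, together with $\kappa = 8\pi G/c^4$. The rounded SI constants and the function names `h00` and `kappa` are illustrative assumptions, not part of the text.

```python
import math

# Rounded SI values (assumed for illustration only).
G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m s^-1


def h00(mass_kg, radius_m):
    """Return h_00 = 2*phi/c^2 with the Newtonian potential phi = -G*M/r."""
    phi = -G * mass_kg / radius_m
    return 2.0 * phi / C**2


def kappa():
    """Return the coupling constant kappa = 8*pi*G/c^4 of (8.18) in SI units."""
    return 8.0 * math.pi * G / C**4


if __name__ == "__main__":
    print("h_00 at the Earth's surface:", h00(5.972e24, 6.371e6))
    print("h_00 at the Sun's surface:  ", h00(1.989e30, 6.957e8))
    print("kappa in SI units:          ", kappa())
```

Both values of $|h_{00}|$ come out many orders of magnitude below unity, which is why the weak-field condition (8.9) holds comfortably within the Solar System.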
Example 8.3.1 Problem. Show by a direct Newtonian approximation of
Einstein's field equations and geodesic equations that the result of (8.18)
follows.

Solution. First note that the contraction of the field equations gives $R = \kappa T$,
where $T = g_{ik}T^{ik}$. Hence the equations may be rewritten as

$$ R_{ik} = -\kappa\left[T_{ik} - \frac{1}{2}g_{ik}T\right]. $$

From (8.11) we get, with $h = \eta^{ik}h_{ik}$,

$$ R_{ik} = \frac{1}{2}\Box h_{ik} + \frac{1}{2}\left\{h_{,ik} - h^{l}_{\ i,lk} - h^{l}_{\ k,li}\right\}. $$

Again ignoring time derivatives and retaining only $h_{00}$ from all $h_{ik}$, we get

$$ R_{00} = -\frac{1}{2}\nabla^2 h_{00}. $$

Likewise, for dust of density $\rho$,

$$ -\kappa\left[T_{00} - \frac{1}{2}g_{00}T\right] = -\kappa\rho/2. $$

Therefore the (00) component of the field equations gives

$$ \nabla^2 h_{00} = \kappa\rho. \tag{A} $$

Next consider the $\mu$ components of the geodesic equations. We write in the
present approximation

$$ \frac{dx^i}{ds} = (1, \mathbf{v}), $$

with the 3-velocity $\mathbf{v}$ of a particle being small in magnitude compared with
$c = 1$. The only relevant $\Gamma^i_{kl}$ is

$$ \Gamma^\mu_{00} = \frac{1}{2}\frac{\partial h_{00}}{\partial x^\mu}, $$

so we get, from the geodesic equations

$$ \frac{d^2x^\mu}{ds^2} + \Gamma^\mu_{kl}\frac{dx^k}{ds}\frac{dx^l}{ds} = 0, $$

the 'Newtonian' equations of motion

$$ \frac{d\mathbf{v}}{dt} = -\frac{1}{2}\nabla h_{00}. \tag{B} $$

These will exactly correspond to the Newtonian equations if we define the
potential by $h_{00} = 2\phi/c^2 = 2\phi$ for $c = 1$. Equation (A) then becomes the
familiar Poisson equation

$$ \nabla^2\phi = 4\pi G\rho, $$

provided that we define $\kappa = 8\pi G$ ($= 8\pi G/c^4$). Thus the match with Newtonian
physics is complete.

Problem. Show that, for a spacetime of constant curvature $K$ satisfying
Einstein's field equations with energy-momentum tensor $T_{ik}$,

$$ K = -\frac{2\pi G}{3}T, $$

where $T$ is the trace of $T_{ik}$.

Solution. We have for the given spacetime

$$ R_{iklm} = K\left[g_{il}g_{km} - g_{im}g_{kl}\right]. $$

This leads to $R_{kl} = -3Kg_{kl}$ and $R = -12K$. Therefore

$$ R_{ik} - \frac{1}{2}g_{ik}R = -3Kg_{ik} + 6Kg_{ik} = 3Kg_{ik}. $$

By equating this to $-8\pi G T_{ik}$ ($c = 1$), we get

$$ 3Kg_{ik} = -8\pi G T_{ik}. $$

Hence on multiplication by $g^{ik}$ we get

$$ 12K = -8\pi G T, $$

from which the result follows.

Problem. In a spacetime containing pure isotropic radiation, show that a 
positive cosmological constant is needed in order to have a positive scalar 
curvature for the spacetime. 

Solution. The field equations with $\lambda$ are

$$ R_{ik} - \frac{1}{2}g_{ik}R + \lambda g_{ik} = -\kappa T_{ik}. $$

Take the trace of these equations, recalling that, for pure isotropic radiation,
$T = 0$. Hence we get

$$ R - 2R + 4\lambda = -\kappa T = 0, $$

i.e.,

$$ \lambda = \frac{1}{4}R. $$

Thus, for $R > 0$, we need $\lambda > 0$.


Thus we have completed our project of evaluating $\kappa$ and relating
the relativistic framework to Newtonian gravitation. Assumptions 1 to
3 above are known as the Newtonian approximation. It leads to the linear
gravitation theory of Newton, which has wide applications, ranging
from the tidal phenomenon of the Earth's oceans to motions of planets
of the Solar System and of stars and galaxies in clusters. Provided that
these three assumptions hold, general relativity does not add anything
new. If assumption 3 is dropped but assumptions 1 and 2 are retained,
we are in the domain of the weak-field theory of gravitational radiation.
For, in the weak-field limit, it is seen that spacetime-curvature
effects propagate as waves with the speed of light. We shall discuss this
intriguing phenomenon in detail in Chapter 11. To get the full effects of
general relativity, however, we must drop all three assumptions and face
the non-linear equations of (8.3) in their most general form. Naturally
this is a complicated task, and after nearly a century of this theory there
are only a handful of exact solutions of direct physical relevance. We
will discuss the earliest, simplest and most important of these solutions
in the following chapter.


Exercises 

1. Assume that $\Gamma^i_{kl}$ are not explicitly related to $g_{ik}$ in the expression for $R$,
which is as given in Chapter 5. Determine the form of $\Gamma^i_{kl}$ by requiring that

$$ \delta\int R\sqrt{-g}\;d^4x = 0 $$

for $\Gamma^i_{kl} \to \Gamma^i_{kl} + \delta\Gamma^i_{kl}$ while the coordinates and the metric tensor remain
unchanged. Show that this method, known as the Palatini method, leads to
the familiar Riemannian affine connection.

2. Verify that Einstein's equations can be obtained if instead of the term
$\int R\sqrt{-g}\;d^4x$ we have the following term in the action:

Notice that this term does not contain second derivatives of $g_{ik}$. However, it is
not an invariant.

3. Show that, if the gravitational equations are obtained from an action principle,
subject to the restriction that the 4-volume of the region $V$ in question,

$$ \int_V \sqrt{-g}\;d^4x, $$

remains unchanged, the Einstein tensor is replaced by

$$ R_{ik} - \frac{1}{2}g_{ik}R + \lambda g_{ik}, $$

where $\lambda$ is a Lagrange multiplier.

4. Show from the linearized form of $R_{iklm}$ that $G_{00}$ and $G_{0\mu}$ do not contain
time derivatives (of any $h_{ik}$) of order two. The equations $G_{0i} = -\kappa T_{0i}$ are called
constraint equations, which must be satisfied by any initial data specified for
solving the problem.

5. Derive the Newtonian approximation of Einstein's field equations with the
cosmological constant.

6. Show that, in Newtonian gravitation with the cosmological term, two masses 
can stay in equilibrium at a specific distance. Is this equilibrium stable, unstable 
or neutral? 

7. In a given volume V of spacetime the Ricci scalar R is expected to be positive. 
Why? 

8. To avoid having to demand the surface condition $\delta g_{ik,l} = 0$ in the Hilbert
action problem, Gibbons and Hawking suggested adding an extra term to the
action in the form of a surface integral:

$$ \frac{1}{8\pi G}\int_{\partial V} n_{i;k}\,(g^{ik} - \epsilon n^i n^k)\sqrt{-h}\;d^3x, $$

where $\partial V$ is the surface of the volume $V$ over which the Hilbert term was
defined, $n_i$ is the unit normal to $\partial V$ and $h^{ik} = g^{ik} - \epsilon n^i n^k$. The quantity $\epsilon = +1$
for timelike $n_i$ and $\epsilon = -1$ for spacelike $n_i$, with $n_i n^i = \epsilon$. Show that the surface
variational term in the Hilbert action is now cancelled out by the variation of the
above surface integral.



Chapter 9 

The Schwarzschild solution 


9.1 The exterior solution 

Shortly after Einstein published his equations of general relativity, Karl
Schwarzschild solved them to find the geometry in the empty spacetime
outside a spherical distribution of matter of mass $M$ (see Reference
[10]). As we know, this is the simplest finite source of matter that gives
rise to gravitational effects. The corresponding problem in Newtonian
gravitation yields the solution for the gravitational potential as

$$ \phi = -\frac{GM}{r}, \tag{9.1} $$

$r$ being the distance from the centre of the spherical distribution. Perhaps
it is worth commenting that this 'simplest' problem took Newton
many years to solve to his satisfaction. The above solution was seen
as the correct one for a point mass. Yet, was it the same for a finite
spherically symmetric distribution of matter? Since Newton wanted to
apply his theory to planets and the Moon, all extended spherical objects,
he wanted to be clear on this issue. For example, the solution, if correct,
does not carry any information about the size or radial inhomogeneity
of the source. For an inverse square law of force, this happens to be correct,
as Newton eventually proved to himself. Today we can prove this
result by solving the Laplace equation for a finite spherically symmetric
source.

Let us now look at the relativistic counterpart of this solution. We
have to determine the spacetime metric for the non-Euclidean geometry
outside the source. At a large distance from the centre, we expect the
gravitational field to be weak. So under the Newtonian approximation
we expect

$$ g_{00} \approx 1 - \frac{2GM}{c^2 r}. \tag{9.2} $$

We will now show how the Schwarzschild solution is obtained and how
this exact solution relates to the above asymptotic form.

The problem is simplified by making use of symmetry arguments. 
If the spacetime outside such a spherical distribution is empty, then its 
geometry should be spherically symmetric about the centre O of the 
distribution. So we start with the most general form of the line element 
that satisfies this requirement of spherical symmetry. 

It can be shown, using arguments from Chapter 6, that the most
general form of such a line element is given by Equation (6.30). We
recall the form of (6.30) here as

$$ ds^2 = A(r,t)\,dt^2 + 2H(r,t)\,dt\,dr + B(r,t)\,dr^2 + F(r,t)\,(d\theta^2 + \sin^2\theta\,d\phi^2). \tag{9.3} $$

We now redefine the radial coordinate as $r'$ by setting $r'^2 = F(r,t)$.
This would lead to changes in the forms of $A$, $B$ and $H$. However, as can
easily be verified, the choice of a new time coordinate $t'$ can be made
such that the cross-product $dt'\,dr'$ disappears from the expression for
$ds^2$. Writing the coefficients of $c^2dt'^2$ and $dr'^2$ as $e^{\nu}$ and $-e^{\lambda}$, respectively,
and dropping the primes on the coordinates $t'$ and $r'$, the line element
may be rewritten as

$$ ds^2 = e^{\nu}c^2dt^2 - e^{\lambda}dr^2 - r^2(d\theta^2 + \sin^2\theta\,d\phi^2), \tag{9.4} $$

where $\nu$ and $\lambda$ are functions of $r$ and $t$. The advantage of the exponential
form is that, for real $\nu$ and $\lambda$, $g_{00} > 0$ and $g_{11} < 0$ as required by the
timelike coordinate $t$ and spacelike coordinate $r$. If $\nu = \lambda = 0$, we get
the Minkowski line element in spherical polar space coordinates. The
non-Euclidean effects are therefore contained in the functions $\lambda$ and $\nu$.
Although in this case $r$ ceases to measure the radial distance from O, it
still has the meaning that the spherical surface $r = \text{constant} = r_0$ (for
example) has the surface area $4\pi r_0^2$.

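Given the line element (9.4), the next step is to calculate $g^{ik}$, $\Gamma_{i|kl}$
and $\Gamma^i_{kl}$. We then calculate the components of $R_{ik}$, which are given by
(5.9) and are expressible in the form

$$ R_{kl} = -\frac{\partial\Gamma^n_{kl}}{\partial x^n} + \frac{\partial^2(\ln\sqrt{-g})}{\partial x^k\,\partial x^l} + \Gamma^m_{kn}\Gamma^n_{lm} - \frac{\partial}{\partial x^n}\left(\ln\sqrt{-g}\right)\Gamma^n_{kl}. \tag{9.5} $$

Since the space outside the distribution is empty, it has $T_{ik} = 0$. Therefore
the contraction of the field equations (8.3) gives $R = 0$, and these
equations reduce to

$$ R_{ik} = 0. \tag{9.6} $$

We now proceed to carry out these steps of calculation. The non-zero
components of $g_{ik}$ and $\Gamma^i_{kl}$ are given below, using the coordinates
defined by $x^0 = t$, $x^1 = r$, $x^2 = \theta$, $x^3 = \phi$:

$$ g_{00} = e^{\nu}, \quad g_{11} = -e^{\lambda}, \quad g_{22} = -r^2, \quad g_{33} = -r^2\sin^2\theta, $$

$$ g^{00} = e^{-\nu}, \quad g^{11} = -e^{-\lambda}, \quad g^{22} = -r^{-2}, \quad g^{33} = -r^{-2}\,\mathrm{cosec}^2\theta, $$

$$ \Gamma_{0|00} = \tfrac{1}{2}e^{\nu}\dot{\nu}, \quad \Gamma_{1|00} = -\tfrac{1}{2}e^{\nu}\nu', \quad \Gamma_{0|01} = \tfrac{1}{2}e^{\nu}\nu', $$

$$ \Gamma_{1|01} = -\tfrac{1}{2}e^{\lambda}\dot{\lambda}, \quad \Gamma_{0|11} = \tfrac{1}{2}e^{\lambda}\dot{\lambda}, \quad \Gamma_{1|11} = -\tfrac{1}{2}e^{\lambda}\lambda', $$

$$ \Gamma_{2|12} = -r, \quad \Gamma_{1|22} = r, \quad \Gamma_{1|33} = r\sin^2\theta, \quad \Gamma_{3|13} = -r\sin^2\theta, $$

$$ \Gamma_{3|23} = -r^2\sin\theta\cos\theta, \quad \Gamma_{2|33} = r^2\sin\theta\cos\theta, $$

$$ \Gamma^0_{00} = \tfrac{1}{2}\dot{\nu}, \quad \Gamma^0_{01} = \tfrac{1}{2}\nu', \quad \Gamma^0_{11} = \tfrac{1}{2}e^{\lambda-\nu}\dot{\lambda}, $$

$$ \Gamma^1_{00} = \tfrac{1}{2}e^{\nu-\lambda}\nu', \quad \Gamma^1_{01} = \tfrac{1}{2}\dot{\lambda}, \quad \Gamma^1_{11} = \tfrac{1}{2}\lambda', $$

$$ \Gamma^1_{22} = -re^{-\lambda}, \quad \Gamma^1_{33} = -re^{-\lambda}\sin^2\theta, \quad \Gamma^2_{12} = \Gamma^3_{13} = \frac{1}{r}, $$

$$ \Gamma^2_{33} = -\sin\theta\cos\theta, \quad \Gamma^3_{23} = \cot\theta. $$

(Here a prime denotes differentiation with respect to $r$, and an overdot
denotes differentiation with respect to $t$.) We next compute the various
components of the Ricci tensor. The (00) and (11) components of (9.6)
give, after some manipulation, the following equations:

$$ e^{-\lambda}\left(\frac{\lambda'}{r} - \frac{1}{r^2}\right) + \frac{1}{r^2} = 0, \tag{9.7} $$

$$ -e^{-\lambda}\left(\frac{\nu'}{r} + \frac{1}{r^2}\right) + \frac{1}{r^2} = 0. \tag{9.8} $$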

From these we get, by subtracting (9.8) from (9.7),

$$ \nu' + \lambda' = 0, $$

that is,

$$ \nu + \lambda = f(t). $$

The arbitrary function $f(t)$ can, however, be set to equal zero since we
still have an arbitrary time transformation

$$ \bar{t} = g(t) $$

at our disposal, which changes $\nu$ to

$$ \nu - 2\ln\left(\frac{dg}{dt}\right) $$

and preserves the form of the line element (9.4). Therefore we can take,
without loss of generality,

$$ \nu + \lambda = 0. \tag{9.9} $$

However, we also have, from $R_{01} = 0$,

$$ \dot{\lambda} = 0. \tag{9.10} $$

Thus both $\lambda$ and $\nu$ ($= -\lambda$) are functions of $r$ only. Equations (9.7) and
(9.8) then yield the solution

$$ e^{\nu} = e^{-\lambda} = 1 - \frac{A}{r}, \qquad A = \text{constant}. $$

However, if we are given that the mass of the object is $M$, we may use
the boundary condition (9.2) as $r \to \infty$ to set $A = 2GM/c^2$. Thus we
get our required solution as the line element

$$ ds^2 = \left(1 - \frac{2GM}{c^2r}\right)c^2dt^2 - \left(1 - \frac{2GM}{c^2r}\right)^{-1}dr^2 - r^2(d\theta^2 + \sin^2\theta\,d\phi^2). \tag{9.11} $$


This is known as the Schwarzschild line element. It turns out that because
of the symmetries of the problem the other field equations are automatically
satisfied: we need only the (11), (00) and (01) components in
order to arrive at the solution. One may notice that the metric behaves
strangely for small $r$, namely for $r < R_s$, where

$$ R_s = \frac{2GM}{c^2}. \tag{9.12} $$

This quantity is called the Schwarzschild radius of the mass. It is easy
to verify that this is an extremely small 'radius' and most known objects
have a radius exceeding it by a large factor. For the Sun, for example, the
Schwarzschild radius is about 3 km, whereas its actual radius is nearly
700 000 km. Idealized objects whose radius equals their Schwarzschild
radius are known as black holes. We shall return to a discussion of black
holes in Chapter 13.
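
The solar figures quoted above can be checked with a few lines of arithmetic. The sketch below evaluates (9.12); the rounded SI constants, the solar mass and radius, and the function name `schwarzschild_radius` are illustrative assumptions rather than values taken from the text.

```python
# Rounded SI values (assumed for illustration only).
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m s^-1
SOLAR_MASS = 1.989e30    # kg
SOLAR_RADIUS = 6.957e8   # m


def schwarzschild_radius(mass_kg):
    """Return R_s = 2*G*M/c^2 in metres, as in (9.12)."""
    return 2.0 * G * mass_kg / C**2


if __name__ == "__main__":
    r_s = schwarzschild_radius(SOLAR_MASS)
    print("Schwarzschild radius of the Sun (km):", r_s / 1000.0)
    print("Actual radius / Schwarzschild radius:", SOLAR_RADIUS / r_s)
```

The result is close to 3 km, smaller than the actual solar radius by a factor of a few hundred thousand.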

The Schwarzschild solution, as derived above, is manifestly static.
Thus there is no scope for a dynamical solution such as one involving
gravitational radiation, even if our spherical source is expanding, contracting,
or oscillating. This remarkable result is known as Birkhoff's
theorem.

Example 9.1.1 Problem. Find a coordinate transformation $r = f(R)$ that
will transform the Schwarzschild exterior line element to a manifestly
isotropic form:

$$ ds^2 = e^{\mu}dt^2 - e^{\sigma}\left[dR^2 + R^2(d\theta^2 + \sin^2\theta\,d\phi^2)\right]. $$

Find $\mu$ and $\sigma$.

Solution. On comparing the two line elements, we find

$$ r^2 = R^2e^{\sigma} \quad\text{and}\quad e^{\lambda}dr^2 = e^{\sigma}dR^2. $$

By eliminating the unknown $\sigma$ we get

$$ \frac{e^{\lambda}dr^2}{r^2} = \frac{dR^2}{R^2}, $$

i.e.,

$$ \ln R = \int\left(1 - \frac{2GM}{r}\right)^{-1/2}\frac{dr}{r}. $$

This can easily be integrated to give $R = \frac{1}{2}\left[r - GM + \sqrt{r(r - 2GM)}\right]$.
Some simple algebra then yields $r - GM = R + G^2M^2/(4R)$. This corresponds
to the transformation

$$ r = R\left(1 + \frac{GM}{2R}\right)^2. $$

The corresponding $e^{\mu}$ and $e^{\sigma}$ are

$$ e^{\mu} = \left(1 - \frac{GM}{2R}\right)^2\left(1 + \frac{GM}{2R}\right)^{-2}, \qquad e^{\sigma} = \left(1 + \frac{GM}{2R}\right)^4. $$
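
The transformation found in Example 9.1.1 can be checked numerically. The sketch below, in units with $G = c = 1$, maps $r$ to the isotropic radius $R$ and back, and compares $e^{\mu}$ with $1 - 2GM/r$. The function names and the sample radii are illustrative assumptions.

```python
import math


def isotropic_radius(r, gm=1.0):
    """R = (1/2)[r - GM + sqrt(r(r - 2GM))], valid for r >= 2GM."""
    return 0.5 * (r - gm + math.sqrt(r * (r - 2.0 * gm)))


def schwarzschild_radius_coordinate(big_r, gm=1.0):
    """Inverse map r = R(1 + GM/(2R))^2."""
    return big_r * (1.0 + gm / (2.0 * big_r)) ** 2


def isotropic_metric_functions(big_r, gm=1.0):
    """Return (e^mu, e^sigma) of the isotropic line element."""
    a = gm / (2.0 * big_r)
    return ((1.0 - a) / (1.0 + a)) ** 2, (1.0 + a) ** 4


if __name__ == "__main__":
    for r in (2.0, 3.0, 10.0, 1000.0):
        big_r = isotropic_radius(r)
        e_mu, e_sigma = isotropic_metric_functions(big_r)
        print(r, big_r, schwarzschild_radius_coordinate(big_r), e_mu, 1.0 - 2.0 / r)
```

Note that the surface $r = 2GM$ corresponds to $R = GM/2$, where $e^{\mu}$ vanishes.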




We end this section with another observation. Suppose we have a
point mass in Newtonian gravitation and we wanted to solve the Laplace
equation to determine the potential. We could do so by invoking the
delta function $\delta(x)$, which vanishes for any non-zero $x$ but has an integral
equal to unity over any interval containing the point $x = 0$. In the
three-dimensional case under consideration the point mass $M$ is represented
by the density

$$ \rho = M \times \frac{\delta(r)}{4\pi r^2}. \tag{9.13} $$

The integration of the Laplace equation then leads us to the solution
(9.1). In the relativistic version, however, we cannot do so! For we have
a problem of singularity at $r = 0$. The metric diverges, so defining a
point mass is not possible. What we have done therefore is to determine
the constant in $e^{\nu}$ by appealing to the Newtonian limit at large distances.

We can, of course, avoid this issue by appealing to the finite size
of any mass. So we next consider the extension of the Schwarzschild
solution to a finite distribution of matter.


9.2 The interior solution 

Let us assume that the source in the above problem is not a point mass but
a spherical distribution of matter confined to a coordinate radius $r = R_0$.
Let $T_{ik}$ denote the energy-momentum tensor of the interior matter. Then
Equations (9.7) and (9.8) get modified to

$$ e^{-\lambda}\left(\frac{\lambda'}{r} - \frac{1}{r^2}\right) + \frac{1}{r^2} = 8\pi G\,T_0^{\ 0}, \tag{9.14} $$

$$ -e^{-\lambda}\left(\frac{\nu'}{r} + \frac{1}{r^2}\right) + \frac{1}{r^2} = 8\pi G\,T_1^{\ 1}. \tag{9.15} $$

Schwarzschild had investigated a special solution in which the interior
is an incompressible fluid of constant density $\rho$ and (variable)
pressure $p$. Thus the energy tensor was taken to be

$$ T^{ik} = (p + \rho)u^iu^k - pg^{ik}, \tag{9.16} $$

where $u^i = (u^0, 0, 0, 0)$ is the flow vector of the fluid at rest. Since
$u^iu_i = 1$, we get $(u^0)^2e^{\nu} = 1$, i.e., $u^0 = e^{-\nu/2}$. Therefore, $T_0^{\ 0} = \rho$,
$T_1^{\ 1} = -p$.

Equation (9.14) can easily be integrated. We have

$$ e^{-\lambda} = 1 - \frac{8\pi G\rho}{3}r^2 = 1 - \alpha r^2, \tag{9.17} $$

say, where $\alpha = 8\pi G\rho/3 = \text{constant}$. We have set $\lambda = 0$ at $r = 0$.

Next consider the energy-conservation equation $T^{\ k}_{i\ ;k} = 0$, for $i = 1$.
From (9.16) we get $T_0^{\ 0} = \rho$ and $T_1^{\ 1} = T_2^{\ 2} = T_3^{\ 3} = -p$, so

$$ 0 = T^{\ k}_{1\ ;k} = \frac{1}{\sqrt{-g}}\frac{\partial}{\partial x^k}\left(\sqrt{-g}\,T_1^{\ k}\right) - \Gamma^l_{1k}T_l^{\ k} $$

$$ = \frac{1}{r^2}e^{-(\nu+\lambda)/2}\frac{\partial}{\partial r}\left(r^2e^{(\lambda+\nu)/2}T_1^{\ 1}\right) - \Gamma^0_{10}T_0^{\ 0} - \left(\Gamma^1_{11} + \Gamma^2_{12} + \Gamma^3_{13}\right)T_1^{\ 1} $$

$$ = -\frac{\partial p}{\partial r} - \left[\frac{2}{r} + \frac{1}{2}(\nu' + \lambda')\right]p - \frac{1}{2}\nu'\rho + \left(\frac{1}{2}\lambda' + \frac{2}{r}\right)p $$

$$ = -\frac{1}{2}\nu'(\rho + p) - \frac{dp}{dr}, $$

i.e.,

$$ \frac{dp}{dr} = -\frac{1}{2}(p + \rho)\nu'. \tag{9.18} $$

Returning to (9.15), we have $\nu'$ given in terms of $e^{\lambda}$, $r$ and $p$ by

$$ \nu' = \frac{1}{r}e^{\lambda} - \frac{1}{r} + 8\pi Gpre^{\lambda}. \tag{9.19} $$

Substitute (9.19) for $\nu'$ in (9.18), then use the definition of $\alpha$ and the
expression (9.17) for $e^{-\lambda}$ to arrive at the following differential equation:

$$ \frac{dp}{dr} = -\frac{4\pi G(p + \rho)r}{3(1 - \alpha r^2)}(\rho + 3p). \tag{9.20} $$

Assuming that $p = 0$ at the surface $r = R_0$, we can integrate (9.20). A
straightforward but tedious integration gives

$$ p = \rho\,\frac{\sqrt{1 - \alpha r^2} - \sqrt{1 - \alpha R_0^2}}{3\sqrt{1 - \alpha R_0^2} - \sqrt{1 - \alpha r^2}}. \tag{9.21} $$

Finally, the equation for $e^{\nu}$ leads to

$$ e^{\nu} = \frac{1}{4}\left[3\sqrt{1 - \alpha R_0^2} - \sqrt{1 - \alpha r^2}\right]^2. \tag{9.22} $$

This is known as the Schwarzschild interior solution.

The pressure $p$ can be arbitrarily high, even exceeding $\rho$. The largest
value of $p$ is at the centre ($r = 0$) and it reaches the limit $p \to \infty$ when
$\alpha R_0^2 = 8/9$, i.e., when

$$ R_0 = (3\pi G\rho)^{-1/2}. \tag{9.23} $$

The interior solution should match the exterior solution obtained earlier
in Section 9.1 across the boundary $r = R_0$. We see that, across this
boundary, $g_{22}$ and $g_{33}$ are continuous. What about $g_{11}$ and $g_{00}$? We expect
that an observer moving across the boundary with a clock should not
notice any discontinuity of time measurement. That is, $e^{\nu}$ should be continuous.
From (9.22) we have at $r = R_0$ that $e^{\nu} = 1 - \alpha R_0^2$. The exterior
solution has $e^{\nu} = 1 - 2GM/R_0 = 1 - 2G \times (4\pi/3)R_0^2\rho = 1 - \alpha R_0^2$.
Thus the continuity of $e^{\nu}$ is maintained. What about $e^{\lambda}$? We find that it is
continuous also. In general this need not be the case. For, in measuring
$ds$ in the radial direction, we are using the exterior solution for $r > R_0$
and the interior solution for $r < R_0$. So the prescription for measuring
$dr$ need not be the same in the two regions.
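
The behaviour of the interior solution can be explored numerically. The sketch below evaluates the pressure profile (9.21) and the function $e^{\nu}$ of (9.22), checks that $e^{\nu}$ joins the exterior value $1 - 2GM/r$ at $r = R_0$, and shows the central pressure growing without bound as $\alpha R_0^2$ approaches 8/9. The function names are illustrative assumptions; the relation $2GM = \alpha R_0^3$ follows from $M = (4\pi/3)R_0^3\rho$ and the definition of $\alpha$.

```python
import math


def pressure_ratio(r, r0, alpha):
    """p/rho from (9.21) at radius r inside a sphere of radius r0."""
    s_r = math.sqrt(1.0 - alpha * r * r)
    s_0 = math.sqrt(1.0 - alpha * r0 * r0)
    return (s_r - s_0) / (3.0 * s_0 - s_r)


def e_nu_interior(r, r0, alpha):
    """e^nu from (9.22)."""
    s_r = math.sqrt(1.0 - alpha * r * r)
    s_0 = math.sqrt(1.0 - alpha * r0 * r0)
    return 0.25 * (3.0 * s_0 - s_r) ** 2


def e_nu_exterior(r, r0, alpha):
    """e^nu = 1 - 2GM/r outside the sphere, using 2GM = alpha * r0^3."""
    return 1.0 - alpha * r0 ** 3 / r


def central_pressure_ratio(alpha_r0_squared):
    """p/rho at r = 0 as a function of alpha*R0^2 (must be below 8/9)."""
    if alpha_r0_squared >= 8.0 / 9.0:
        raise ValueError("no finite central pressure for alpha*R0^2 >= 8/9")
    s_0 = math.sqrt(1.0 - alpha_r0_squared)
    return (1.0 - s_0) / (3.0 * s_0 - 1.0)


if __name__ == "__main__":
    for x in (0.1, 0.5, 0.8, 0.88, 0.888):
        print("alpha*R0^2 =", x, " p_c/rho =", central_pressure_ratio(x))
```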


Example 9.2.1 Problem. From equations set up to describe a spherically
symmetric situation show that in general $\nu + \lambda < 0$ at any finite $r$ if the
spacetime is asymptotically flat.

Solution. We have the two equations

$$ e^{-\lambda}\left(\frac{\nu'}{r} + \frac{1}{r^2}\right) - \frac{1}{r^2} = -8\pi G\,T_1^{\ 1}, $$

$$ -e^{-\lambda}\left(\frac{\lambda'}{r} - \frac{1}{r^2}\right) - \frac{1}{r^2} = -8\pi G\,T_0^{\ 0}. $$

On subtracting the second equation from the first we get

$$ \frac{e^{-\lambda}(\nu' + \lambda')}{r} = 8\pi G\left(T_0^{\ 0} - T_1^{\ 1}\right). $$

Now, for most physically relevant $T_k^{\ i}$, we have $T_0^{\ 0} > 0$, $T_1^{\ 1} < 0$. For
example, for a fluid $T_0^{\ 0} = \rho > 0$, $T_1^{\ 1} = -p < 0$. Thus we have, from the
above equation, $\nu' + \lambda' > 0$, i.e., $(\nu + \lambda)$ at a finite value of $r = r_0$ equals

$$ (\nu + \lambda)_\infty - \int_{r_0}^{\infty}(\nu' + \lambda')\,dr. $$

But, assuming spacetime to be asymptotically flat, i.e., that of special relativity,
$(\nu + \lambda)_\infty = 0$. Since the integral itself is positive, the above equation
gives that $(\nu + \lambda)$ at $r = r_0$ is negative.

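Problem. For a spherical object in equilibrium under gravitation and its fluid
pressures, show that

$$ \frac{dp}{dr} = -\frac{4\pi Gr(p + \rho)}{1 - \dfrac{2Gm(r)}{r}}\left[p + \frac{m(r)}{4\pi r^3}\right], $$

where $m(r) = \int_0^r 4\pi r_1^2\rho(r_1)\,dr_1$.

Solution. From Equations (9.14), (9.15) and (9.18) we get a series of results
that lead to the desired answer as follows. On writing $T_0^{\ 0} = \rho$, we get from
(9.14)

$$ e^{-\lambda} = 1 - \frac{2Gm(r)}{r}, \qquad m(r) \text{ as defined}. \tag{A} $$

From (9.18) we get

$$ \frac{dp}{dr} = -\frac{1}{2}(p + \rho)\nu', \tag{B} $$

while from (9.15) we have

$$ \nu' = 8\pi Gpre^{\lambda} + \frac{1}{r}e^{\lambda} - \frac{1}{r}. \tag{C} $$

We substitute for $e^{\lambda}$ from (A) in the above equation to get

$$ \nu' = -\frac{1}{r} + \left(1 - \frac{2Gm}{r}\right)^{-1}\left[\frac{1}{r} + 8\pi Gpr\right] $$

$$ = \left(1 - \frac{2Gm}{r}\right)^{-1}\left[\frac{1}{r} + 8\pi Gpr - \frac{1}{r}\left(1 - \frac{2Gm}{r}\right)\right] $$

$$ = \left(1 - \frac{2Gm}{r}\right)^{-1}\left[\frac{2Gm}{r^2} + 8\pi Gpr\right] $$

$$ = \frac{8\pi Gr}{1 - \dfrac{2Gm}{r}}\left[p + \frac{m}{4\pi r^3}\right]. $$

Use this value of $\nu'$ in Equation (B) to get the required result.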
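
As a consistency check on the equilibrium equation just derived, the sketch below integrates it inward from the surface for a constant-density sphere and compares the resulting central pressure with the closed form (9.21). It works in units with $G = 1$ and uses a fixed-step fourth-order Runge-Kutta scheme; the scheme, the step count and all function names are illustrative choices, not taken from the text.

```python
import math


def tov_rhs(r, p, rho):
    """dp/dr = -4*pi*r*(p + rho)*(p + m/(4*pi*r^3)) / (1 - 2m/r), with G = 1."""
    if abs(r) < 1e-12:
        return 0.0  # the bracket stays finite while the overall factor r vanishes
    m = 4.0 * math.pi * rho * r ** 3 / 3.0  # m(r) for constant density
    bracket = p + m / (4.0 * math.pi * r ** 3)
    return -4.0 * math.pi * r * (p + rho) * bracket / (1.0 - 2.0 * m / r)


def central_pressure_numeric(rho, r0, steps=2000):
    """Integrate from p(r0) = 0 down to r = 0 with classical RK4."""
    h = -r0 / steps
    p = 0.0
    for i in range(steps):
        r = r0 + i * h
        k1 = tov_rhs(r, p, rho)
        k2 = tov_rhs(r + 0.5 * h, p + 0.5 * h * k1, rho)
        k3 = tov_rhs(r + 0.5 * h, p + 0.5 * h * k2, rho)
        k4 = tov_rhs(r + h, p + h * k3, rho)
        p += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return p


def central_pressure_analytic(rho, r0):
    """Central pressure from (9.21) evaluated at r = 0."""
    alpha = 8.0 * math.pi * rho / 3.0
    s_0 = math.sqrt(1.0 - alpha * r0 * r0)
    return rho * (1.0 - s_0) / (3.0 * s_0 - 1.0)


if __name__ == "__main__":
    rho = 1.0
    r0 = math.sqrt(0.5 * 3.0 / (8.0 * math.pi))  # chosen so that alpha*R0^2 = 0.5
    print("numeric :", central_pressure_numeric(rho, r0))
    print("analytic:", central_pressure_analytic(rho, r0))
```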


We will discuss in Chapter 12 situations more general than the
somewhat artificial one assumed in Schwarzschild's interior solution.
Here we will return to the exterior solution and consider its dynamical
and geometrical consequences. These turn out to have bearing
on realistic tests that can be designed to test the validity of general
relativity.


9.3 Motion of a test particle 

Imagine a test particle moving in the spacetime of Schwarzschild's exterior
solution. By a 'test' particle we mean a particle that is subjected to
the gravitational influence of the central mass $M$, but which does not
in turn contribute to any gravitational effects of its own mass. Thus we
have introduced an artificiality into the picture, which can be justified
only if the mass of our moving particle is negligibly small compared
with $M$. In this case we use the result that our test particle follows a
timelike geodesic.

Writing its equations in the standard form

$$ \frac{d^2x^i}{ds^2} + \Gamma^i_{kl}\frac{dx^k}{ds}\frac{dx^l}{ds} = 0, \tag{9.24} $$

with $x^0 = t$, $x^1 = r$, $x^2 = \theta$, $x^3 = \phi$, we get, for $i = 0$,

$$ \frac{d^2t}{ds^2} + \Gamma^0_{kl}\frac{dx^k}{ds}\frac{dx^l}{ds} = 0. \tag{9.25} $$

From our earlier computations in Section 9.1, we have the only non-zero
$\Gamma^0_{kl}$ as $\Gamma^0_{01} = \nu'/2$. So we get for (9.25)

$$ \frac{d^2t}{ds^2} + \nu'\frac{dt}{ds}\frac{dr}{ds} = 0. \tag{9.26} $$

This easily integrates to

$$ e^{\nu}\frac{dt}{ds} = \text{constant} = \gamma \text{ (say)}. \tag{9.27} $$

Likewise we get, for $i = 3$, the following first integral:

$$ r^2\sin^2\theta\,\frac{d\phi}{ds} = \text{constant} = h \text{ (say)}. \tag{9.28} $$

We may identify $\gamma$ as the energy per unit mass and $h$ as the angular
momentum per unit mass. The $i = 2$ (theta) equation reduces to

$$ \frac{d^2\theta}{ds^2} + \frac{2}{r}\frac{d\theta}{ds}\frac{dr}{ds} - \sin\theta\cos\theta\left(\frac{d\phi}{ds}\right)^2 = 0. \tag{9.29} $$

This somewhat complicated-looking equation has a simple solution:

$$ \theta = \text{constant} = \frac{\pi}{2}. \tag{9.30} $$

It may be easily verified that, provided we 'start' the particle moving
in the $\theta = \pi/2$ plane, it will continue to move in that plane. In
a spherically symmetric spacetime any plane through the centre may
be chosen as a reference plane, without any loss of generality. Taking
the $\theta = \pi/2$ plane as the chosen case, we see that the solution to the
geodesic equations is

$$ \frac{dt}{ds} = \gamma e^{-\nu}, \qquad r^2\frac{d\phi}{ds} = \text{constant} = h \quad\text{and}\quad \theta = \frac{\pi}{2}. \tag{9.31} $$

Since the metric itself is a first integral of the geodesic equations,
we get

$$ 1 = e^{\nu}\left(\frac{dt}{ds}\right)^2 - e^{-\nu}\left(\frac{dr}{ds}\right)^2 - r^2\left(\frac{d\phi}{ds}\right)^2, $$

where $e^{\nu} = 1 - 2GM/r$. By substituting from (9.31), we can transform
this equation to

$$ \left(\frac{dr}{ds}\right)^2 = \gamma^2 - V^2(r), \tag{9.32} $$

where

$$ V^2(r) = \left(1 - \frac{2GM}{r}\right)\left(1 + \frac{h^2}{r^2}\right) \tag{9.33} $$

is the 'effective potential'. For motion to be possible we need $V^2(r) \le \gamma^2$.

(i) The Newtonian approximation 

In this case,

$$ |\gamma - 1| \ll 1, \qquad r \gg 2GM \quad\text{and}\quad h \ll r. \tag{9.34} $$

The first inequality tells us that the total energy $\gamma c^2$ is not much different
from the rest-mass energy, as is the case in a slow-motion approximation.
The $r \gg 2GM$ inequality relates to Schwarzschild's solution,
identifying a weak-field approximation. The last inequality tells us that
the transverse velocity $h/r$ is small compared with $c$ ($= 1$).

In this case we have

$$ V^2 = 1 + 2F_N, \tag{9.35} $$

where

$$ F_N = \frac{h^2}{2r^2} - \frac{GM}{r} \tag{9.36} $$

is the effective Newtonian potential for radial motion:

$$ \frac{1}{2}\left(\frac{dr}{ds}\right)^2 + F_N = \frac{\gamma^2 - 1}{2} = E_N \tag{9.37} $$

(say).

In Figure 9.1, $F_N$ is plotted against $r$ to illustrate the typical Newtonian
situation. Notice that $F_N$ drops from an infinite value to zero as
$r$ increases from zero to a finite value $r_p = h^2/(2GM)$. Thereafter it
continues to decrease as $r$ increases. As $r$ approaches $2r_p = h^2/(GM)$,
$F_N$ reaches a negative minimum. Beyond $2r_p$, as $r$ increases to infinity,
$F_N$ increases but stays negative. It asymptotically approaches zero.

Fig. 9.1. The 'Newtonian' approximation of motion in the empty exterior
Schwarzschild solution is described by the potential-distance plot shown.
The dotted curve represents radial motion. See the text for interpretation of
the curves $E_N <, =, > 0$.

Against this background we have three lines a, b and c drawn horizontally,
corresponding to three values of $E_N$, the 'total' energy of the
particle. Line a corresponds to $E_N = 0$ and to motion in a parabolic trajectory.
Line b has $E_N > 0$ and describes a hyperbolic trajectory. Line
c with $E_N < 0$ describes the elliptical orbits typically followed by planets
around the Sun. Notice that, because the kinetic energy has to be
positive, the condition to be satisfied by the trajectory is $F_N \le E_N$.
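
The features of the Newtonian effective potential described above are easy to verify numerically. The sketch below, in units with $c = 1$, evaluates $F_N = h^2/(2r^2) - GM/r$, locates its zero and its minimum, and finds the two turning radii of a bound orbit from the condition $F_N = E_N$, which is the quadratic $2E_Nr^2 + 2GMr - h^2 = 0$. The function names and sample values are illustrative assumptions.

```python
import math


def f_newton(r, h, gm=1.0):
    """Effective Newtonian potential F_N = h^2/(2 r^2) - GM/r."""
    return h * h / (2.0 * r * r) - gm / r


def potential_landmarks(h, gm=1.0):
    """Return (r_p, r_min, F_min): zero of F_N, location and depth of its minimum."""
    r_p = h * h / (2.0 * gm)
    r_min = 2.0 * r_p
    return r_p, r_min, -gm * gm / (2.0 * h * h)


def bound_turning_points(e_n, h, gm=1.0):
    """Inner and outer radii where F_N = E_N for a bound orbit (E_N < 0)."""
    disc = gm * gm + 2.0 * e_n * h * h
    if e_n >= 0.0 or disc < 0.0:
        raise ValueError("need F_min <= E_N < 0 for a bound orbit")
    root = math.sqrt(disc)
    return (-gm + root) / (2.0 * e_n), (-gm - root) / (2.0 * e_n)


if __name__ == "__main__":
    print(potential_landmarks(1.0))
    print(bound_turning_points(-0.375, 1.0))
```

For $h = GM = 1$ and $E_N = -0.375$ the orbit oscillates between $r = 2/3$ and $r = 2$, the perihelion and aphelion of the corresponding ellipse.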


(ii) The relativistic orbits 

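We now turn to orbits that might not satisfy (9.34). In this case we
have to use the full equation (9.32). To facilitate the algebra, define
dimensionless parameters

$$ x = r/(GM), \qquad \eta = h/(GM), \qquad \sigma = s/(GM). \tag{9.38} $$

We then have (9.32) written as

$$ \left(\frac{dx}{d\sigma}\right)^2 = \gamma^2 - V^2(x), \tag{9.39} $$

where

$$ V^2(x) = \left(1 - \frac{2}{x}\right)\left(1 + \frac{\eta^2}{x^2}\right). \tag{9.40} $$

The function $V(x)$ has a maximum $V_{\max}$ at $x = x_{\max}$ and a minimum
$V_{\min}$ at $x = x_{\min}$. Both $x_{\min}$ and $x_{\max}$ are given by the equation $\partial V^2/\partial x = 0$,
i.e., by the quadratic equation

$$ x^2 - \eta^2x + 3\eta^2 = 0. \tag{9.41} $$

We therefore have

$$ x_{\max} = \frac{1}{2}\left\{\eta^2 - \eta\sqrt{\eta^2 - 12}\right\}, \qquad x_{\min} = \frac{1}{2}\left\{\eta^2 + \eta\sqrt{\eta^2 - 12}\right\}. \tag{9.42} $$

The maxima and minima coincide for $\eta^2 = 12$, i.e., for

$$ h = 2\sqrt{3}\,GM, \qquad x_{\min} = x_{\max} = 6. \tag{9.43} $$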

From the considerations of stability of orbits we deduce that circular
orbits at $x = x_{\max}$ are unstable, whereas those at $x = x_{\min}$ are stable.
From (9.43) we see that stable circular orbits are possible for $r > 6GM$,
the smallest such orbit having radius $6GM$. If $\eta \to \infty$, $x_{\min} \to \infty$ but
$x_{\max} \to 3$. Thus $r = 3GM$ is the lower limit to the size of circular orbits
all of which are unstable. Figure 9.2 illustrates these features.

Fig. 9.2. The square of the effective potential for relativistic motion is plotted
against distance scaled to the size of half the Schwarzschild radius. See the
text for a discussion of the various curves shown here.

Another point of interest is the value of $\eta$ for which $V_{\max} = 1$. This
happens at $\eta = 4$, $x_{\max} = 4$.

In Figure 9.2, $V^2$ is plotted against $x$ for various values of $\eta$. We note
that, for $\eta < 2\sqrt{3}$, $V^2$ has no real turning points and it increases from
$V^2 = 0$ at $x = 2$ to $V^2 = 1$ at $x \to \infty$. Thus an incoming particle with
$\gamma > 1$ will fall in without a bounce. The same conclusion applies to an
incoming particle with $\gamma > V$ to start with. For $2\sqrt{3} < \eta < 4$ there are
bound orbits like the Newtonian ellipses, provided that $\gamma < V_{\max} < 1$.
For $\gamma > V_{\max}$ the incoming particle falls in to be sucked into the object.

For $\eta > 4$ there are three types of orbits. Those with $\gamma > V_{\max}$ represent
particles that, if coming in, fall into the object. Incoming particles
with $1 < \gamma < V_{\max}$ bounce at the potential barrier (at some minimum
$r$) and then move out again like the hyperbolic orbits. Similarly, for
$\gamma < 1 < V_{\max}$ the orbits are bound as for Newtonian ellipses.
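
The statements about the extrema of the relativistic effective potential can be checked with a short calculation. The sketch below evaluates $V^2(x) = (1 - 2/x)(1 + \eta^2/x^2)$ and the two roots of (9.41). The function names and the sample values of $\eta$ are illustrative assumptions.

```python
import math


def v_squared(x, eta):
    """Dimensionless effective potential V^2(x) = (1 - 2/x)(1 + eta^2/x^2)."""
    return (1.0 - 2.0 / x) * (1.0 + eta * eta / (x * x))


def extrema(eta):
    """Return (x_max, x_min) from x^2 - eta^2 x + 3 eta^2 = 0, or None if eta^2 < 12."""
    disc = eta * eta - 12.0
    if disc < 0.0:
        return None  # V^2 rises monotonically: no circular orbits
    root = eta * math.sqrt(disc)
    return 0.5 * (eta * eta - root), 0.5 * (eta * eta + root)


if __name__ == "__main__":
    for eta in (3.0, 3.5, 4.0, 6.0, 100.0):
        ext = extrema(eta)
        if ext is None:
            print("eta =", eta, ": no turning points of V^2")
        else:
            x_max, x_min = ext
            print("eta =", eta, " x_max =", x_max, " V^2_max =", v_squared(x_max, eta),
                  " x_min =", x_min, " V^2_min =", v_squared(x_min, eta))
```

For $\eta = 4$ the unstable orbit sits at $x = 4$ with $V^2_{\max} = 1$, while for large $\eta$ the inner root approaches $x = 3$, in line with the discussion above.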
Example 9.3.1 Problem. Calculate the Schwarzschild time and the proper
time elapsed when a particle moves once round a circular path of size $r = a$.
What happens when $r \to 3GM$?

Solution. For travel along a circular path $dr/ds = 0$, $d^2r/ds^2 = 0$. From
Equation (9.32) we have by differentiation

$$ \frac{d^2r}{ds^2} = -\frac{1}{2}\frac{d}{dr}\left(V^2\right). $$

Hence, at $r = a$ we require

$$ \gamma^2 = V^2(a) \quad\text{and}\quad \left.\frac{d}{dr}V^2(r)\right|_{r=a} = 0. $$

So we have two equations:

$$ \gamma^2 = \left(1 - \frac{2GM}{a}\right)\left(1 + \frac{h^2}{a^2}\right) \tag{A} $$

and

$$ \frac{2GM}{a^2}\left(1 + \frac{h^2}{a^2}\right) - \frac{2h^2}{a^3}\left(1 - \frac{2GM}{a}\right) = 0. \tag{B} $$

From (B) we get

$$ h^2 = \frac{a^2GM}{a - 3GM}, \qquad 1 + \frac{h^2}{a^2} = \frac{a - 2GM}{a - 3GM}. $$

Hence from (A) we get

$$ \gamma^2 = e^{2\nu}\,\frac{a}{a - 3GM}. $$

Let $T$ be the time taken by one revolution, as measured by the $t$ time
coordinate. Then

$$ T\,\frac{d\phi}{dt} = 2\pi. $$

However, $r^2\,d\phi/ds = h$, i.e., for $r = a$ we have

$$ h = a^2\,\frac{d\phi}{dt}\cdot\frac{dt}{ds}. $$

Since $e^{\nu}(dt/ds) = \gamma$, we have $dt/ds = \gamma e^{-\nu} = \sqrt{a/(a - 3GM)}$. We therefore
have

$$ \frac{d\phi}{dt} = \frac{h}{a^2}\sqrt{\frac{a - 3GM}{a}} = \frac{1}{a}\sqrt{\frac{GM}{a - 3GM}}\sqrt{\frac{a - 3GM}{a}} = \frac{1}{a}\sqrt{\frac{GM}{a}}. $$

Therefore

$$ T = \frac{2\pi a^{3/2}}{(GM)^{1/2}}. $$

Since

$$ \frac{ds}{dt} = \sqrt{\frac{a - 3GM}{a}}, $$

the time taken as measured by the proper time of the observer is

$$ \sqrt{\frac{a - 3GM}{a}}\;\frac{2\pi a^{3/2}}{(GM)^{1/2}} = 2\pi a\sqrt{\frac{a - 3GM}{GM}}. $$

As $a \to 3GM$ this shrinks to zero since the observer tends to have a null
geodesic.
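
The results of Example 9.3.1 lend themselves to a quick numerical check. The sketch below, in units with $G = c = 1$, returns $h$, $\gamma$, the coordinate period $T$ and the proper period for a circular orbit of radius $a > 3GM$. The function name and the sample radii are illustrative assumptions.

```python
import math


def circular_orbit(a, gm=1.0):
    """Return (h, gamma, T_coordinate, T_proper) for a circular orbit of radius a."""
    if a <= 3.0 * gm:
        raise ValueError("timelike circular orbits require a > 3GM")
    h = a * math.sqrt(gm / (a - 3.0 * gm))
    gamma = (a - 2.0 * gm) / math.sqrt(a * (a - 3.0 * gm))
    t_coordinate = 2.0 * math.pi * a ** 1.5 / math.sqrt(gm)
    t_proper = 2.0 * math.pi * a * math.sqrt((a - 3.0 * gm) / gm)
    return h, gamma, t_coordinate, t_proper


if __name__ == "__main__":
    for a in (3.001, 4.0, 6.0, 100.0):
        print(a, circular_orbit(a))
```

At $a = 6GM$ the angular momentum equals $2\sqrt{3}\,GM$, matching (9.43), and the ratio of proper to coordinate period, $\sqrt{(a - 3GM)/a}$, tends to zero as $a \to 3GM$.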


9.4 Trajectories of photons 

The null geodesics describe trajectories of photons and, following our
earlier work (vide Section 5.2) and analogously to Equation (9.31), we
set up their equations as follows:

$$ e^{\nu}\frac{dt}{d\lambda} = \text{constant} = \frac{1}{b}, \qquad r^2\frac{d\phi}{d\lambda} = 1, \qquad \theta = \frac{\pi}{2}. \tag{9.44} $$

We have taken $\lambda$ to be an affine parameter and scaled it so that the second
of the above equations has unity on the right-hand side. Likewise it is
convenient to write the constant in the first equation as $1/b$, rather
than $\gamma$.

The first integral of the geodesic equation then becomes

$$ \left(\frac{dr}{d\lambda}\right)^2 = \frac{1}{b^2} - V^2(r), \qquad V^2(r) = \frac{1}{r^2}\left(1 - \frac{2GM}{r}\right). \tag{9.45} $$

Figure 9.3 shows a plot of $V^2$ against the dimensionless 'distance',
$x = r/(GM)$. Starting at zero value for $x = 2$, $V^2$ rises to a maximum
value of $(27G^2M^2)^{-1}$ at $x = x_{\max} = 3$ and then falls off to zero as
$x \to \infty$. What does this potential behaviour imply for an incoming
photon?

If the photon travels with $b < 3\sqrt{3}\,GM$, then it has $1/b^2 >
(27G^2M^2)^{-1}$ and such a photon cannot bounce at a potential wall and
go out again: it drops into the object. A photon with $b > 3\sqrt{3}\,GM$ will
bounce and go out. What about $b = 3\sqrt{3}\,GM$? At this value the line
touches the peak of the potential curve and the point of contact corresponds
to a circular trajectory with radius $x = 3$ ($r = 3GM$). However,
this trajectory is unstable and the photon, on slight disturbance, either
falls in or spirals out to $r = \infty$.
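
The threshold between capture and escape can be illustrated with a short calculation. The sketch below, in units with $G = c = 1$, evaluates the photon potential $V^2(r) = (1 - 2GM/r)/r^2$ and classifies an incoming photon by comparing its impact parameter $b$ with $3\sqrt{3}\,GM$. The function names and the returned labels are illustrative assumptions.

```python
import math


def photon_v_squared(r, gm=1.0):
    """Photon effective potential V^2(r) = (1 - 2GM/r) / r^2."""
    return (1.0 - 2.0 * gm / r) / (r * r)


def critical_impact_parameter(gm=1.0):
    """b = 3*sqrt(3)*GM, for which 1/b^2 equals the peak value 1/(27 G^2 M^2)."""
    return 3.0 * math.sqrt(3.0) * gm


def photon_fate(b, gm=1.0):
    """Classify an incoming photon by its impact parameter b."""
    b_crit = critical_impact_parameter(gm)
    if b < b_crit:
        return "captured"
    if b > b_crit:
        return "scattered"
    return "unstable circular orbit at r = 3GM"


if __name__ == "__main__":
    print("peak of V^2 at r = 3:", photon_v_squared(3.0), "vs 1/27 =", 1.0 / 27.0)
    for b in (4.0, 5.0, 6.0, 10.0):
        print("b =", b, "->", photon_fate(b))
```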
Fig. 9.3. The curve similar to Figure 9.2 is drawn here for describing the
motion of photons in Schwarzschild's spacetime. See the text for a
discussion of the various horizontal levels here. (Axes: $V^2$ against $x$,
starting from $x = 2$; the peak value is $1/(27G^2M^2)$ and a level with
$b < 3\sqrt{3}\,GM$ is marked.)


These considerations make an important point: that gravitation 
affects the path of light. Isaac Newton had wondered about this effect, 
when he wrote 

Do not bodies act upon light at a distance? And by their action bend its 
Rays: and is not this action [caeteris paribus] strongest at the least 
distance? 


Optics: Query 1. 


Einstein's general relativity returns an affirmative reply to the question
'Does gravitation affect the light track?'. Is this reply in conformity
with reality? To find out, we move to the next chapter to discuss the
experimental tests of general relativity.

Exercises 

1. By considering the isometries of the spherically symmetric spacetime deduce
that, with the notation used in the text,

$$T^2_2 = T^3_3$$

and that all components of $T^i_k$ with $i \neq k$, except $T^1_0$, are zero.

2. Calculate the non-zero components of $R_{iklm}$ in the Schwarzschild spacetime.
Verify that $R_{iklm}R^{iklm}$ is finite at the Schwarzschild radius.

3. Show that there exists a path of the light ray in the form of a terminating 
spiral given by 



in the gravitational field of a spherical object of mass M.

4. Show that the gravitational mass of a static spherical star of perfect fluid is
given by

$$M = \int_0^{R_b} (\rho + 3p)\, e^{(\nu + \lambda)/2}\, 4\pi r^2\, dr,$$

with the notation used in the text.

5. Show that a spherical distribution of perfect fluid satisfying $p = k\rho$ in
hydrostatic equilibrium cannot have a bounding surface $r = R_b < \infty$ at which $p = 0$.

6. Show that in the interior Schwarzschild solution the central redshift $z_c$ is
related to the surface redshift $z_s$ by the relation

$$1 + z_c = \frac{2}{2 - z_s}(1 + z_s).$$

As $z_s \to 2$, $z_c \to \infty$. (For a definition of gravitational redshift see Chapter 10,
Section 10.3.)

7. Write down the equation of geodesic deviation for test particles falling freely
along radial trajectories onto the central gravitating point mass M. Interpret the
two cases in which the initial deviation is in (i) the $\phi$-coordinate and (ii) the
$r$-coordinate. Compare your results with the Newtonian theory.

8. Show that a freely and radially falling tachyon (described by a spacelike
geodesic) bounces at a finite Schwarzschild $r$-coordinate in the gravitational
field of a central gravitating mass M.

9. Show that the test particles experience no gravitational forces inside a self- 
gravitating hollow spherical shell. 

10. Give dynamical arguments to show that the orbit r = 3 GM is unstable, 
whereas r = 6 GM is stable. 



Chapter 10 

Experimental tests of general relativity 


10.1 Introduction 

The general theory of relativity, like any other physical theory, must
submit itself for experimental verification. It started with a disadvantage
in that it was competing with a well-established paradigm, viz. the
Newtonian laws of motion and gravitation. Any test that could be designed
for testing general relativity had at the same time to show ways of
distinguishing its predictions from those of the Newtonian framework. Here
the situation has been different from the case of special relativity. Many
laboratory tests have been designed (see some in the opening chapter)
to study the dynamics of fast-moving particles. For, in this case, the
crucial factor $\gamma$, distinguishing relativity from Newtonian dynamics, is
significantly different from unity. For really testing general relativity we
need situations of strong gravitational fields that cannot be arranged in
a terrestrial laboratory. The differences from Newtonian predictions can
and do exist in relatively weak fields, however, provided that we look
at astronomical situations. Therefore astronomical tests have figured
prominently in establishing the general theory.

In the early days Einstein proposed three tests, which are known as 
the classical tests of general relativity. More tests emerged in later years, 
although their number is still small. In this chapter we will discuss both
classes of tests. All except one require an astronomical setting. 

To place matters in proper perspective, let us see how ‘strong’ or
‘weak’ the Earth’s gravitational field is at its maximum, i.e., on the
surface of the Earth. Putting in the numbers for M and R, the mass and
the radius of the Earth, we get the dimensionless parameter

$$\frac{2GM}{c^2R} \cong 1.4 \times 10^{-9}. \qquad (10.1)$$

This ratio is close to $4 \times 10^{-6}$ when evaluated for the Sun on the solar
surface and to $2 \times 10^{-8}$ when evaluated at a typical point on the Earth’s
orbit around the Sun. The smallness of these numbers conveys the
challenge facing an experimenter attempting to distinguish between the
predictions of Newton and Einstein. For, to detect any effect characteristic of
non-Euclidean geometry as expected in general relativity, these numbers
should be closer to unity.
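
These orders of magnitude are easy to reproduce. The short Python sketch below evaluates $2GM/(c^2R)$ for the three situations just mentioned. The numerical constants (Newton's constant, the speed of light, the masses, radii and the Earth-Sun distance) are rounded standard values supplied for this illustration and are not taken from the text; the function name `field_strength` is likewise an assumption of the example.

```python
G_NEWTON = 6.674e-11      # m^3 kg^-1 s^-2 (rounded standard value)
C_LIGHT = 2.998e8         # m/s (rounded standard value)

def field_strength(mass_kg, distance_m):
    """Dimensionless ratio 2GM/(c^2 R) at distance R from a mass M."""
    return 2.0 * G_NEWTON * mass_kg / (C_LIGHT**2 * distance_m)

M_EARTH, R_EARTH = 5.972e24, 6.371e6      # kg, m (assumed rounded values)
M_SUN, R_SUN = 1.989e30, 6.96e8           # kg, m (assumed rounded values)
EARTH_ORBIT = 1.496e11                    # m, mean Earth-Sun distance (assumed)

earth_surface = field_strength(M_EARTH, R_EARTH)
sun_surface = field_strength(M_SUN, R_SUN)
earth_orbit = field_strength(M_SUN, EARTH_ORBIT)

print(earth_surface, sun_surface, earth_orbit)
```

With these inputs the three ratios should come out at the orders of magnitude quoted above, namely about $10^{-9}$, $10^{-6}$ and $10^{-8}$, respectively.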

The physicist cannot help making another comparison. For testing 
the electromagnetic theory, both in its classical and in its quantum
version, a large number of experiments could be performed. The more
experimental checks there are the greater the confidence inspired by a
physical theory. Because of its somewhat esoteric nature the general
theory falls far short in such a comparison. This was reportedly one reason
why Einstein was not given Nobel Prizes for the special and general 
theories of relativity: apparently the experts were not convinced that 
enough experimental proofs had been provided for these theories. This 
was somewhat surprising, at least in the case of special relativity; for the 
discoverers of other effects (such as the Compton effect) involving the 
dynamics of special relativity were awarded the Nobel Prize. 

We mention these perspectives so that the reader will appreciate the 
few tests that there are! 


10.2 The PPN parameters 

Most of the present tests of general relativity are based on the 
Schwarzschild solution, and they seek to measure the fine differences 
between the predictions of Newtonian gravitation and those of general 
relativity. These form the main part of this chapter. 

Before confronting the experimental situation, however, it is necessary
to clarify how to attach meanings to measurements in a spacetime
that is non-Euclidean. We have already seen that coordinates have no
absolute status, hence blindly relying on them might lead to incorrect
results. The Schwarzschild metric (vide Chapter 9) can be used to
illustrate the concept of measurement as seen in the example below.


Example 10.2.1 Suppose that an observer is located at a point with $r$ =
constant, $\theta$ = constant, $\phi$ = constant. How does he relate the time $\tau$ kept
by his watch to the coordinate $t$? From the principle of equivalence we
know that, since $d\tau = ds/c$ measures the observer’s proper time in a locally
inertial frame, being a scalar, it does so in all frames. For our observer,
$dr = 0$, $d\theta = 0$, $d\phi = 0$; so from (9.11) we get

$$d\tau = \left(1 - \frac{2GM}{c^2 r}\right)^{1/2} dt.$$

This gives the required answer.
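
To get a feeling for the size of this effect, the relation $d\tau = (1 - 2GM/c^2r)^{1/2}\,dt$ can be evaluated for an observer imagined to be at rest at the solar surface. The Python sketch below is only an illustration: it uses $GM_\odot/c^2 = 1.475 \times 10^5$ cm, the value quoted later in this chapter, while the solar radius and the function name `clock_rate` are assumptions made for the example.

```python
import math

GM_SUN_CM = 1.475e5     # GM/c^2 for the Sun in cm (value quoted in Section 10.4)
R_SUN_CM = 6.96e10      # solar radius in cm (assumed value, not from the text)

def clock_rate(r, gm_over_c2):
    """d(tau)/dt for an observer at rest at Schwarzschild radial coordinate r."""
    if r <= 2.0 * gm_over_c2:
        raise ValueError("no static observer exists at or inside r = 2GM/c^2")
    return math.sqrt(1.0 - 2.0 * gm_over_c2 / r)

rate = clock_rate(R_SUN_CM, GM_SUN_CM)
# Seconds by which the watch lags behind coordinate time over one year.
lag_per_year_s = (1.0 - rate) * 365.25 * 86400.0

print(rate, lag_per_year_s)
```

The watch runs slow relative to coordinate time by roughly two parts in a million, which under these assumptions accumulates to about a minute per year.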


The experimental tests mostly revolve around the Schwarzschild 
line element as applied to objects in the Solar System. However, beyond 
comparing the relativistic predictions with the corresponding Newtonian 
ones, there has also been interest in other theories of gravitation. Some 
of these, such as the Brans-Dicke theory, will be discussed in Chapter 
18. These theories use a spacetime metric as in relativity, but come 
up with line elements different from Schwarzschild’s. All these can be 
simultaneously looked at in their weak-field limits and by comparing 
their predictions in the various experiments. A series of parameters can 
be used to specify the various components of the metric with reference 
to these tests. Since we are looking at a level of approximation one 
step beyond the Newtonian limit, the procedure is called parametrized 
post-Newtonian approximation or simply the PPN approximation. The
parameters are denoted by $\gamma$, $\beta$, $\xi$, $\alpha_1$, $\alpha_2$, $\alpha_3$, and $\zeta_1$, $\zeta_2$, $\zeta_3$, $\zeta_4$. We
will not discuss the details of how these parameters are derived in a
particular theory (see Table 10.1 from Reference [18], a review by 
C. M. Will, which is given at the end of this chapter), except to identify 
the first two, which have values of unity in general relativity and occur 
explicitly in the classical tests of this theory. The rest have value zero in 
relativity. 

To identify $\beta$ and $\gamma$, we express the Schwarzschild line element in
the isotropic form, in which the spatial part of the metric is the Euclidean
one multiplied by a radial function:

$$ds^2 = e^{\mu}\,dt^2 - e^{\eta}\,[dR^2 + R^2(d\theta^2 + \sin^2\theta\, d\phi^2)], \qquad (10.2)$$

where $\mu$ and $\eta$ are functions of the new radial coordinate $R$ (see Example
9.1.1). By expanding these in powers of $(M/R)$ we get

$$e^{\mu} = 1 - 2\frac{GM}{c^2R} + 2\beta\left(\frac{GM}{c^2R}\right)^2, \qquad e^{\eta} = 1 + 2\gamma\frac{GM}{c^2R}, \qquad (10.3)$$


where, as mentioned earlier for general relativity, both $\beta$ and $\gamma$ are unity.
For some other theories they may have different values. We will later
summarize the present status of the measured values of these parameters.

Fig. 10.1. Signal communication between two observers A and B, in a
general spacetime with an inhomogeneous gravitational field.
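
The statement that $\beta = \gamma = 1$ in general relativity can be checked numerically. The Python sketch below assumes the standard closed form of the isotropic Schwarzschild metric functions, $e^{\mu} = [(1 - x/2)/(1 + x/2)]^2$ and $e^{\eta} = (1 + x/2)^4$ with $x = GM/(c^2R)$; this closed form is supplied here as an assumption, since the text only refers to Example 9.1.1 for it. The function names are likewise hypothetical. The sketch compares these functions with the truncated expansions of the PPN form.

```python
def exact_isotropic(x):
    """Assumed isotropic Schwarzschild metric functions, x = GM/(c^2 R)."""
    e_mu = ((1.0 - 0.5 * x) / (1.0 + 0.5 * x)) ** 2
    e_eta = (1.0 + 0.5 * x) ** 4
    return e_mu, e_eta

def ppn_truncated(x, beta=1.0, gamma=1.0):
    """Truncated PPN expansions: second order for e^mu, first order for e^eta."""
    e_mu = 1.0 - 2.0 * x + 2.0 * beta * x * x
    e_eta = 1.0 + 2.0 * gamma * x
    return e_mu, e_eta

x = 1.0e-3                                  # a deliberately large weak-field value
exact_mu, exact_eta = exact_isotropic(x)
ppn_mu, ppn_eta = ppn_truncated(x)          # beta = gamma = 1
off_mu, _ = ppn_truncated(x, beta=0.9)      # a deliberately wrong beta

print(exact_mu - ppn_mu)     # residual of order x^3
print(exact_eta - ppn_eta)   # residual of order x^2
print(exact_mu - off_mu)     # residual of order x^2: beta = 0.9 does not fit
```

With $\beta = \gamma = 1$ the residuals are of the order of the first neglected power of $x$, whereas a different value of $\beta$ leaves a mismatch already at order $x^2$.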


10.3 The gravitational redshift 

Imagine two observers A and B moving in spacetime exchanging
light signals with each other. In the general situation illustrated in
Figure 10.1, we have B transmitting two successive wavefronts at instants
$B_1$ and $B_2$ corresponding to successive peaks of intensity. The proper
time elapsed between these instants as measured by B is, say, $\Delta_B$. To B,
these measurements bring the information that the wavelength of light
just released by him is $\Delta_B \times c$, where $c$ is the speed of light.

In the corresponding geometrical optics we may argue that these
wavefronts reach A along null rays $\ell_1$ and $\ell_2$, reaching points $A_1$ and $A_2$
on the world line of A. To A the proper time gap between the receipts of
these two wave peaks is $\Delta_A$. To him therefore the wavelength received
will appear to be $\Delta_A \times c$. Denoting the wavelengths emitted (by B) and
received (by A) as $\lambda_B$ and $\lambda_A$, respectively, we define the spectral shift by

$$z = \frac{\lambda_A - \lambda_B}{\lambda_B} = \frac{\Delta_A - \Delta_B}{\Delta_B}. \qquad (10.4)$$

In the optical spectrum, the red colour is towards the long-wavelength 
end and the blue/violet colour is at the short-wavelength end. Hence, 
if in optical astronomy the observer finds that z > 0, the result is 
described as redshift. Likewise, if z < 0, the result is called blueshift. 

We will encounter various applications of this result under different
physical conditions. The application most physicists are familiar with
is that due to pure motion, known as the Doppler effect. As described in
Chapter 1, this arises when there is relative motion between A and B. In
a later chapter we will consider the cosmological redshift arising from
an overall expansion of the Universe. Here we describe the spectral
shift arising from the passage of light through a static inhomogeneous
gravitational field.

10.3.1 The gravitational spectral redshift

Consider any static line element - that is, one in which the $g_{ik}$ do not depend
on $x^0 = ct$. Suppose we have two observers A and B with world lines

$$x^{\mu} = \text{constant} = a^{\mu},\ b^{\mu}, \qquad (10.5)$$

respectively. Let $\ell_1$ be a null geodesic from B to A, with parametric
equations given by

$$x^i = x^i(\lambda), \qquad (10.6)$$

with $x^{\mu}(0) = b^{\mu}$, $x^{\mu}(1) = a^{\mu}$, $x^0(0) = ct_B$, $x^0(1) = ct_A$. What does our
geodesic correspond to in physical terms?

It describes a light ray leaving observer B at time $t_B$ and reaching
observer A at time $t_A$. Because of the static nature of the line element,
we also have another null geodesic solution given by

$$x^{\mu} = x^{\mu}(\lambda), \quad \mu = 1, 2, 3, \qquad x^0 = x^0(\lambda) + \Delta c, \quad \Delta = \text{constant}. \qquad (10.7)$$

This describes a light ray $\ell_2$ leaving B at $t_B + \Delta$ and reaching A at
$t_A + \Delta$. Figure 10.2 illustrates this result.



Fig. 10.2. Signal communication in a static spacetime. If B is in a stronger
gravitational field than A, the signals from B to A will show a redshift.

Now, in the rest frame of B, the time interval $\Delta$ corresponds to a
proper time interval (measured by B) of

$$\delta\tau_B = \Delta\,[g_{00}(b^{\mu})]^{1/2}.$$

If $n$ light waves have left B in this time interval, then the frequency of
these waves as measured by B is

$$\nu_B = \frac{n}{\Delta}\,[g_{00}(b^{\mu})]^{-1/2}.$$

Since the same number of waves is received by A in the corresponding
proper time interval $\delta\tau_A$, we get the ratio of frequencies measured by B
and A as

$$\frac{\nu_B}{\nu_A} = \left[\frac{g_{00}(a^{\mu})}{g_{00}(b^{\mu})}\right]^{1/2}. \qquad (10.8)$$


This is also the ratio of the wavelengths $\lambda_A : \lambda_B$ measured by A and B,
respectively.

If in the Schwarzschild solution B is an observer located on the
surface of a star, at $r = r_s$, say, and A is a distant observer with $r \gg
2GM/c^2$, we get

$$\frac{\lambda_A}{\lambda_B} = \left(1 - \frac{2GM}{c^2 r_s}\right)^{-1/2}. \qquad (10.9)$$

Thus spectral lines from a massive compact star should be redshifted.
For $2GM/(c^2 r_s)$ small compared with unity, the redshift

$$\frac{\lambda_A - \lambda_B}{\lambda_B} \cong \frac{GM}{c^2 r_s}. \qquad (10.10)$$


White dwarf stars like Sirius B and 40 Eridani B do show redshifts in
the range of $10^{-4}$ to $10^{-5}$, which are of the right order of magnitude.
More reliable and quantitatively accurate measurements, however, are 
possible only in a terrestrial experiment. 
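
The order of magnitude is easily verified. The Python sketch below compares the exact Schwarzschild expression for the redshift of light from a stellar surface, $(1 - 2GM/c^2r_s)^{-1/2} - 1$, with its weak-field limit $GM/(c^2r_s)$. The star used is purely illustrative: one solar mass ($GM/c^2 \approx 1.475$ km) with an assumed radius of 6000 km, chosen only to be typical of a white dwarf and not taken from the text. The function names are also assumptions of the example.

```python
def redshift_exact(gm_over_c2, r_s):
    """Redshift seen by a distant observer, from the ratio of g_00 values."""
    if r_s <= 2.0 * gm_over_c2:
        raise ValueError("surface must lie outside r = 2GM/c^2")
    return (1.0 - 2.0 * gm_over_c2 / r_s) ** -0.5 - 1.0

def redshift_weak_field(gm_over_c2, r_s):
    """Leading-order approximation GM/(c^2 r_s)."""
    return gm_over_c2 / r_s

GM_SUN_KM = 1.475       # GM/c^2 for one solar mass, in km
R_DWARF_KM = 6.0e3      # illustrative white-dwarf radius (assumed)

z_exact = redshift_exact(GM_SUN_KM, R_DWARF_KM)
z_weak = redshift_weak_field(GM_SUN_KM, R_DWARF_KM)

print(z_exact, z_weak)
```

For such a star both expressions give a redshift of a few times $10^{-4}$, and they differ from each other only at the level of $10^{-7}$.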


10.3.2 The Pound-Rebka experiment 

In the first such laboratory experiment, which was carried out in 1960,
R. V. Pound and G. A. Rebka measured the change in the frequency
of a gamma-ray photon emitted by an excited cobalt nucleus as it fell
from a height of 60-70 feet. For details of their experiment, see
Reference [11]. Figure 10.3 gives a schematic description. As such a photon
falls through a height $H$, its Newtonian potential energy per unit mass
decreases by $gH$, where $g$ is the acceleration due to gravity on the
Earth’s surface. Because of the corresponding gain in energy, the photon should
undergo a blueshift; that is, its frequency increases by a fraction $gH/c^2$.
Although this fraction is as small as $10^{-15}$, it can be measured by modern
laboratory techniques.

Fig. 10.3. A schematic diagram describing the Pound-Rebka experiment on
the measurement of gravitational spectral shift. The upward arrow indicates
the alignment of the iron nucleus to receive the gamma ray coming from above.

The trick is to have an iron nucleus as the absorber at the bottom. 
By moving the nucleus at a suitable speed away from the approaching 
gamma-ray photon falling from above, one can effectively reduce its 
apparent relative frequency with respect to the iron nucleus. When the 
frequency matches the energy gap between the cobalt and iron nuclei 
absorption occurs. The speed of the iron nucleus then tells us what 
blueshift the gamma ray had. 
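
The numbers involved can be estimated as follows. The Python sketch below evaluates the fractional blueshift $gH/c^2$ for the two ends of the quoted range of heights, together with the speed $v = gH/c$ at which the absorber would have to recede for the first-order Doppler shift $v/c$ to cancel it. The values of $g$ and $c$ are rounded standard values, and the function names are assumptions introduced for this illustration.

```python
G_SURFACE = 9.81           # m/s^2, acceleration due to gravity (rounded)
C_LIGHT = 2.998e8          # m/s (rounded)
METRES_PER_FOOT = 0.3048

def fractional_blueshift(height_m):
    """Fractional frequency increase gH/c^2 for a fall through height H."""
    return G_SURFACE * height_m / C_LIGHT**2

def compensating_speed(height_m):
    """Recession speed v with v/c = gH/c^2 (first-order Doppler shift)."""
    return fractional_blueshift(height_m) * C_LIGHT

for feet in (60.0, 70.0):
    height = feet * METRES_PER_FOOT
    print(feet, fractional_blueshift(height), compensating_speed(height))
```

The fractional shift comes out at about $2 \times 10^{-15}$, and the compensating speed is below a micrometre per second, which indicates how delicate the measurement is.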

This experiment had been thought of much earlier, but ensuring 
that there would be no recoil problems with the absorbing nucleus had 
proved to be difficult, until the discovery of the Mössbauer effect. The
nucleus could then be held in a crystal. The recoil was largely borne by
the holder as per the Mössbauer effect.

The Pound-Rebka experiment and later work have confirmed the 
gravitational redshift effect to a high level of accuracy. 

Notice, however, that we used Newtonian gravity applied to the
photon to derive the expected result. We could have also used the relativistic
formula (10.8) with the approximation on the surface of the Earth, with
the same result. Thus a defender of Newtonian gravity could argue that
the formula (10.10) does not uniquely confirm general relativity. So we
now turn to two other tests, which do precisely that.


10.4 The precession of the perihelion of Mercury

An outstanding mystery in planetary astronomy at the beginning of the 
twentieth century had been the anomalous behaviour of the orbit of the 
planet Mercury. It had been found that, slowly but surely, there was a 
secular motion of the perihelion of Mercury’s orbit. The effect can be 
described as follows. 

If we denote the polar coordinates on a plane by $(r, \phi)$, then, in the
Newtonian approximation, the planet describes an ellipse given by its
polar equation

$$\frac{l}{r} = 1 + e\cos(\phi - \phi_0). \qquad (10.11)$$

Here $l$ is the semi latus rectum, $e$ the eccentricity and $\phi_0$ the direction in
which its perihelion (point of closest approach to the Sun) lies.

Observations of the orbit of the planet Mercury had revealed that
$\phi_0$ is not a constant. Rather the perihelion precesses steadily at a small
but perceptible rate of 5600 ± 0.401 arcseconds per century. Of this,
~5025 arcseconds per century could be explained by the fact that the
observations from the Earth are in its non-inertial frame of reference -
the Earth spins about an axis and also goes round the Sun. Thus there
was a remaining amount of about 575 arcseconds to explain. Of this,
all but 43 arcseconds per century could be accounted for by the
perturbation of Mercury’s orbit by the Newtonian gravitational effect of other
planets. How should one account for the remaining 43 arcseconds per
century? Notice that the amount to be explained is minuscule: it works
out that Mercury’s perihelion is seen to advance by about a 35 000th
part of a degree at the end of one orbit (see Figure 10.4). The fact that
this discrepancy was a matter for worry shows the high expectations
scientists (astronomers and physicists) had of the Newtonian paradigm.

Fig. 10.4. A grossly exaggerated picture of the advance of the perihelion of
Mercury. The actual effect as described in the text is minute but significant.

Indeed in the 1860s, when the discrepancy became a matter of 
concern, U. J. J. Leverrier in Paris had offered a solution to the problem 
on the basis of his experience two decades earlier when a discrepancy 
had been noticed in the orbit of Uranus. At that time Leverrier had 
suggested that the orbit of Uranus was being perturbed by a new planet 
in the vicinity. (A similar prediction had been made by John Couch 
Adams.) The explanation had worked and a new planet, later called
Neptune, was identified as the cause of the discrepancy. This time,
Leverrier suggested a similar explanation: look for an intramercurial 
planet as the perturber of Mercury’s orbit. He even named the new planet 
Vulcan. Alas, no such planet was found despite exhaustive searches. 
So the discrepancy remained a possible demonstration of a failure of 
Newton’s laws of gravitation and motion. 

This was taken as a challenge by proponents of the general theory 
of relativity. Taking the Sun as the source mass M in the Schwarzschild 
solution and Mercury as a test particle, one can work out the orbit of 
Mercury. Following the treatment of the problem given in Section 9.3, 
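we arrive at the following equation for a planetary orbit:

$$\frac{1}{r^4}\left(\frac{dr}{d\phi}\right)^2 + \left(1 - \frac{2GM}{r}\right)\left(\frac{1}{h^2} + \frac{1}{r^2}\right) = \frac{\gamma^2}{h^2}. \qquad (10.12)$$

We simplify this by writing $u = 1/r$:

$$\left(\frac{du}{d\phi}\right)^2 = \frac{\gamma^2 - 1}{h^2} - u^2 + \frac{2GMu}{h^2} + 2GMu^3. \qquad (10.13)$$

Differentiate this relation with respect to $\phi$. After taking out the
common factor $du/d\phi$, we get the equation as

$$\frac{d^2u}{d\phi^2} + u = \frac{GM}{h^2} + 3GMu^2. \qquad (10.14)$$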


A comparison with Newtonian orbital dynamics will reveal that 
here we have an extra term on the right-hand side. In the Newtonian 
framework such a term would have arisen from an extra force obeying
an inverse fourth-power ($r^{-4}$) law. The extra force is small compared
with the Newtonian force. Thus its effect on the Newtonian orbit would 
be small. We will try to estimate it with the problem of Mercury in view. 
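
Before estimating the effect analytically, it may help to see it in a direct numerical integration. The Python sketch below is an illustration under stated assumptions, not the method of the text: it rescales $u$ by $GM/h^2$, so that Equation (10.14) reads $u'' + u = 1 + 3\epsilon u^2$ with $\epsilon = (GM)^2/h^2$ in units with $c = 1$, and it uses a deliberately exaggerated value of $\epsilon$ so that the shift is visible. The function name, the step size and the use of a fourth-order Runge-Kutta scheme are choices made for this example. The result is compared with the first-order shift of $6\pi\epsilon$ per orbit obtained by the perturbative argument of this section.

```python
import math

def perihelion_shift(eps, ecc, step=1.0e-3):
    """Angle beyond 2*pi between successive perihelia of u'' + u = 1 + 3*eps*u**2.

    u is measured in units of GM/h^2; eps = (GM)^2/h^2 with c = 1.
    """
    def accel(u):
        return 1.0 + 3.0 * eps * u * u - u

    # Start at a perihelion: u is at a maximum and du/dphi = 0.
    phi, u, du = 0.0, 1.0 + ecc, 0.0
    while phi < 4.0 * math.pi:
        # One classical fourth-order Runge-Kutta step for the pair (u, du).
        k1u, k1v = du, accel(u)
        k2u, k2v = du + 0.5 * step * k1v, accel(u + 0.5 * step * k1u)
        k3u, k3v = du + 0.5 * step * k2v, accel(u + 0.5 * step * k2u)
        k4u, k4v = du + step * k3v, accel(u + step * k3u)
        u_new = u + step * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0
        du_new = du + step * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        if du > 0.0 and du_new <= 0.0:
            # du/dphi changes sign from + to -: the next perihelion.
            # Locate the crossing by linear interpolation within the step.
            fraction = du / (du - du_new)
            return phi + fraction * step - 2.0 * math.pi
        phi, u, du = phi + step, u_new, du_new
    raise RuntimeError("no second perihelion found; the orbit may be unbound")

eps = 1.0e-3                      # exaggerated; for Mercury eps is of order 1e-8
shift = perihelion_shift(eps, ecc=0.2)
first_order = 6.0 * math.pi * eps

print(shift, first_order)
```

For this value of $\epsilon$ the integrated shift should agree with $6\pi\epsilon$ to within about a per cent, the small difference being due to terms of higher order in $\epsilon$.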

Let us write the solution of the purely Newtonian equation (i.e.,
without the last term) as

$$u_0 = \frac{1}{l}[1 + e\cos(\phi - \phi_0)], \qquad l = \frac{h^2}{GM}. \qquad (10.15)$$

Next we try a perturbative approach to solve Equation (10.14) by
substituting for $u$ the above solution for the (small) second term on the
right-hand side of the equation. Thus we write

$$\frac{d^2u}{d\phi^2} + u = \frac{GM}{h^2} + \frac{3GM}{l^2}[1 + 2e\cos(\phi - \phi_0) + e^2\cos^2(\phi - \phi_0)]. \qquad (10.16)$$

Try a solution of this equation in the following form:

$$u = u_0 + u_1 + u_2, \qquad (10.17)$$

where

$$\frac{d^2u_1}{d\phi^2} + u_1 = \frac{6GM}{l^2}\,e\cos(\phi - \phi_0),$$

$$\frac{d^2u_2}{d\phi^2} + u_2 = \frac{3GM}{l^2}[1 + e^2\cos^2(\phi - \phi_0)].$$


The complementary functions for these equations are $\cos\phi$ and
$\sin\phi$. The right-hand side of the second of the above two equations does
not contain any of these functions, so we will get a bounded oscillatory 
solution from this equation. Such a solution will not explain a secular 
behaviour like that shown by the perihelion of Mercury. 

The other equation, namely the first of the above two equations, does
have these functions on the right-hand side, so we do expect a secular
solution here. Indeed, a particular integral of the differential equation is

$$u_1 = \frac{3GMe}{l^2}\,\phi\sin(\phi - \phi_0). \qquad (10.18)$$

On adding this to the Newtonian solution, we get the approximate
secular solution as

$$u = u_0 + u_1 = \frac{1}{l}\left\{1 + e\left[\cos(\phi - \phi_0) + \frac{3GM}{l}\,\phi\sin(\phi - \phi_0)\right]\right\} \cong \frac{1}{l}[1 + e\cos(\phi - \phi_0 - \epsilon)],$$

where we have assumed that

$$\epsilon = \frac{3GM}{l} \times \phi \qquad (10.19)$$

is a small quantity so that $\cos\epsilon$ is taken as unity and $\sin\epsilon$ is approximated
by $\epsilon$. Thus we see that the direction to the perihelion is not constant at
the value $\phi_0$, but changes its magnitude at a steady rate as illustrated.
This precession of perihelion at a rate computed over a period $T$ of one
orbit (when $\phi$ changes by $2\pi$) around the Sun ($M = M_\odot$) is given by

$$n = \frac{6\pi GM_\odot}{l\,T\,c^2}. \qquad (10.20)$$





On putting in the values $l = 5.53 \times 10^{12}$ cm and $GM_\odot/c^2 = 1.475 \times 10^5$
cm and one century = $415T$, we get $n$ as 43.03 arcseconds (per century).
This was an excellent resolution of a long-standing anomaly and it went
a long way towards establishing the credibility of general relativity in
the minds of physicists and astronomers. The value of $n$ is largest for
Mercury, which of all the planets has the orbit which is most eccentric
and closest to the Sun. Given the same major axis, the latus rectum
$l$ in the denominator of formula (10.20) is small for an orbit of high
eccentricity.
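
The arithmetic behind this figure can be laid out explicitly. The Python sketch below evaluates formula (10.20) with the values quoted above for the semi latus rectum, for $GM_\odot/c^2$ and for the number of orbits per century; only the variable and function names are assumptions of the example.

```python
import math

L_MERCURY_CM = 5.53e12        # semi latus rectum of Mercury's orbit (from the text)
GM_SUN_CM = 1.475e5           # GM/c^2 for the Sun in cm (from the text)
ORBITS_PER_CENTURY = 415.0    # one century = 415 T (from the text)
ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi

def precession_per_orbit(gm_over_c2, semi_latus_rectum):
    """Perihelion advance per orbit in radians, 6*pi*GM/(l c^2)."""
    return 6.0 * math.pi * gm_over_c2 / semi_latus_rectum

per_orbit = precession_per_orbit(GM_SUN_CM, L_MERCURY_CM)
per_century_arcsec = per_orbit * ORBITS_PER_CENTURY * ARCSEC_PER_RADIAN

print(per_orbit, per_century_arcsec)
```

The advance per orbit is about $5 \times 10^{-7}$ radian, and the accumulated value per century agrees with the quoted 43 arcseconds to within the rounding of the input numbers.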


10.4.1 The binary pulsar

In the late 1970s, a more dramatic example of such a precession was
observed for the periastron of the binary star system that houses the
pulsar PSR 1913+16. Here the gravitational effects are stronger than in the
Sun-Mercury system, and the precession rate is as high as 4.23 degrees
per year - about $3.6 \times 10^4$ times the value for Mercury. (See Reference
[12].) We should caution the reader, however, that, unlike in the Sun- 
Mercury case where, because of the large disparity of their masses, the 
Sun could be considered at rest and Mercury moving around it as a ‘test’ 
particle, in the binary pulsar case the two stars have comparable masses 
and hence the Schwarzschild solution is not strictly applicable. Ideally 
one should solve the relativistic two-body problem. This has not been 
possible so far, so only an approximate extrapolation of the Sun-Mercury 
problem is generally used for the above theoretical comparison. 


10.5 The bending of light 

The perihelion precession of Mercury was the best of the three tests of 
general relativity in establishing a clear stamp of the theory. However, 
in terms of a popular impact, the test to be described now played a key 
role in establishing the superiority of general relativity over Newtonian 
gravity and making Einstein a celebrity. 

Does gravity affect light by bending its direction? When Newton 
described his law of gravitation as universal, he meant it to be applicable 
to all forms of matter, large and small. But what about light? Did the 
universality extend to light rays also? Newton was not sure. We have 
mentioned his query on this issue in Section 9.4. 

Nevertheless, if we take some liberties with the Newtonian concept,
we can get an affirmative reply to this query. Imagine light made of
particles, e.g., photons. A photon of frequency $\nu$ would have an energy
$h\nu$ and hence a mass equivalent of $h\nu/c^2$. Let such a particle approach
a mass $M$ from infinity such that its asymptotic direction of motion
passes it at a distance $b$. Assume that the particle had velocity $c$ at
infinity and it travelled like a typical particle in Newtonian dynamics.
What will be the final direction of the particle as it recedes from $M$ at
infinity? The bending angle works out as $2GM/(c^2R)$, where $R$ is the
distance of closest approach of the particle. In what we will refer to as
the ‘Newtonian prediction’ we imply this value.

Fig. 10.5. The direction from the Earth (E) to the star (S) is changed
because of the 'bending' of light by gravity. The star image accordingly
is shifted (to S'). This shift, in reality, is quite small and is shown in an
exaggerated form in this figure.

While developing the general theory of relativity, Einstein had an 
earlier version that gave exactly this answer. In 1911 he put it up for 
testing should a suitable opportunity arise. The way to test the result is 
explained in Figure 10.5. The source of light, say a star S, is viewed 
from the Earth, E, on two occasions, once when the Sun is grazing its 
line of sight as shown in the figure and once when the Sun is nowhere 
near the line ES. As shown in Figure 10.5, the light ray from the star 
approaches the Sun at a closest point P on its surface. Although the ray 
is said to be bent by gravity, as we know, it is really following a null 
geodesic in the curved spacetime produced by the Sun’s gravity. Clearly, 
under normal circumstances it is not possible to sight the star with the 
Sun in the foreground. The exceptional situation when we can see the 
star is when the Sun is totally eclipsed. If one measures the direction of 
the star at this stage and compares it with the direction under normal 
circumstances (when the Sun is nowhere near) one should find a small 
difference, corresponding to the bending angle predicted in Figure 10.5. 

There was an eclipse due in 1914 that would be visible from Russia 
and a team of scientists from Germany went to observe it and to perform 
this experiment. But World War I broke out and the team members were
interned as aliens from an enemy country. This turned out to be fortunate 
for Einstein, in a way; for in 1915, when he arrived at the final form of 
relativity, which we are studying here, he found that under it the correct 
answer was twice what he had got earlier. In short, his prediction of the 
bending angle had changed. It would have been embarrassing for him 
had the 1914 expedition gone ahead and found a result that disagreed 
with his then prediction. Meanwhile, amongst the few astronomers who 
understood what relativity was all about was Arthur Stanley Eddington 
at Cambridge. Eddington proposed an eclipse expedition that would test 
Einstein’s claim after the war had ended. A total solar eclipse in 1919, 
visible from a band in the southern hemisphere, was of sufficiently 
long duration to attempt the observations. Fortunately the war ended 
in 1918 and a financial grant of one thousand pounds from the British 
Government enabled Eddington to execute his plans. 

The trials and tribulations of the experiment and its report to the 
joint meeting of the Royal Society and the Royal Astronomical Society 
on 6 November 1919 have been described in a very absorbing account
by Peter Coles [13]. We will next carry out a brief calculation of the
expected result.

Just as timelike geodesics determine the tracks of planets, we can 
calculate the track of a light ray by solving the equations of its null 
geodesic. These equations were written down in general and solved in 
the case of the Schwarzschild exterior spacetime geometry in the last 
chapter: vide Equations (9.44). These can be combined to form a single
equation given by

$$\frac{1}{r^4}\left(\frac{dr}{d\phi}\right)^2 + \frac{1}{r^2}\left(1 - \frac{2GM}{r}\right) = \frac{1}{b^2}. \qquad (10.21)$$

In terms of $u = 1/r$ this equation takes the form

$$\left(\frac{du}{d\phi}\right)^2 + u^2 = \frac{1}{b^2} + 2GMu^3. \qquad (10.22)$$

Differentiation with respect to $\phi$ gives

$$\frac{d^2u}{d\phi^2} + u = 3GMu^2. \qquad (10.23)$$

We will solve this non-linear equation in a perturbative fashion
as we did in the case of Mercury’s orbit. Looking at Figure 10.5,
we first represent the tangential straight line YPY' as the zeroth-order
(no bending) approximation:

$$u_0 = \frac{1}{R_0}\cos\phi. \qquad (10.24)$$


Here P is the point of closest approach to the Sun and it is given by
the maximum value of $u_0$, i.e., by $\phi = 0$. $R_0$, the closest distance, is,
ideally, the Sun’s radius. On substituting this solution into the right-hand
side of Equation (10.23) we get the next approximation satisfying

$$\frac{d^2u}{d\phi^2} + u = \frac{3GM}{R_0^2}\cos^2\phi, \qquad (10.25)$$

which has the solution

$$u = \frac{1}{R_0}\cos\phi + \frac{GM}{R_0^2}(2 - \cos^2\phi). \qquad (10.26)$$

Now, looking at Figure 10.5, we see that the above equation describes
the curve EPS, the asymptotes of which are denoted by $u = 0$, $\phi = \pm\phi_0$,
where

$$\phi_0 = \frac{\pi}{2} + \frac{2GM}{R_0}, \qquad (10.27)$$

since we expect $2GM/R_0$ to be small compared with unity. The net
deflection of the light ray is therefore given by

$$\Delta\phi = \frac{4GM}{R_0c^2} = 1.75 \text{ arcseconds}, \qquad (10.28)$$

when we substitute the values for the Sun. To enable the numerical
computation we have restored the speed of light $c$ to its proper place.
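
Both the asymptote condition and the final number can be checked numerically. The Python sketch below finds, by bisection, the angle $\phi_0$ at which the solution $u = (1/R_0)\cos\phi + (GM/R_0^2)(2 - \cos^2\phi)$ vanishes, and compares the resulting deflection $2(\phi_0 - \pi/2)$ with $4GM/(R_0c^2)$. It uses $GM_\odot/c^2 = 1.475 \times 10^5$ cm as quoted earlier in this chapter; the solar radius, the bracketing interval and the function name are assumptions made for the illustration.

```python
import math

GM_SUN_CM = 1.475e5      # GM/c^2 for the Sun in cm (quoted in Section 10.4)
R_SUN_CM = 6.96e10       # solar radius in cm (assumed value, not from the text)
ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi

def asymptote_angle(a):
    """Root phi_0 > pi/2 of cos(phi) + a*(2 - cos(phi)**2) = 0, a = GM/(c^2 R_0)."""
    def f(phi):
        c = math.cos(phi)
        return c + a * (2.0 - c * c)

    # For small a, f is positive at pi/2 and negative slightly beyond it.
    low, high = 0.5 * math.pi, 0.5 * math.pi + 0.1
    for _ in range(80):
        mid = 0.5 * (low + high)
        if f(mid) > 0.0:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)

a = GM_SUN_CM / R_SUN_CM
phi_0 = asymptote_angle(a)
deflection_numeric = 2.0 * (phi_0 - 0.5 * math.pi) * ARCSEC_PER_RADIAN
deflection_formula = 4.0 * a * ARCSEC_PER_RADIAN
newtonian_value = 0.5 * deflection_formula

print(deflection_numeric, deflection_formula, newtonian_value)
```

With these inputs the two estimates coincide to high accuracy and round to 1.75 arcseconds, while half that value, about 0.87 arcsecond, corresponds to the ‘Newtonian prediction’.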

In his report to the scientific meeting in November 1919, Eddington
compared his values of $\Delta\phi$ with the theoretical predictions of 1.75
arcseconds (Einstein) and 0.875 arcsecond (Newton). He had
measurements from two places, from Sobral in Brazil and Principe in the Gulf of Guinea. His
conclusion was in favour of Einstein. This observational confirmation,
besides its high impact on the media, went a long way towards
establishing the credibility of general relativity in the eyes of the physicists.

Yet, looking back at those results in a dispassionate way, one could 
point to several uncertainties that might have weakened that conclusion! 
The experiment itself was not performed under the best of conditions 
and the equipment used had room for improvement. The positions of the 
stars when the Sun was not in the vicinity could not be measured from 
the same location, thus introducing a possible source of error. Also, the 
optical refraction effects in the upper layers of the solar atmosphere were 
not adequately estimated. Thus there was a strong reason for repeating 
the experiment whenever the eclipse opportunity presented itself. 


10.5.1 Measurement at longer wavelengths 

Subsequent attempts by optical astronomers yielded somewhat
inconclusive results, largely because of the limited sensitivity of the measuring
equipment and the uncertain nature of systematic errors. In the 1970s,
however, measurements with microwaves confirmed the above bending
angle much more precisely with only about 5% experimental error.
Pioneering measurements were made by Counselman and others in 1974
(see Reference [14]) and by E. B. Fomalont and R. A. Sramek [15] in
1975. The technique used was to observe the quasar 3C-279, whose line
of sight intersects the Sun every October. Since the Sun is not a
powerful radiator of energy at these wavelengths, there is no need to wait
for a solar eclipse in order to make the measurements. The direction to
another source, 3C-273, was used as a reference point for measuring the
shift in angle.

This technology has subsequently been improved to reduce the error
bars further, as can be seen in Table 10.1 at the end of this chapter.


10.6 Radar echo delay

Just as the direction of a light ray is altered by the Sun’s gravity, so is
its apparent travel time. This effect, which was first highlighted by L.
I. Schiff, can also be calculated in a straightforward manner. We first
derive the result and then discuss how it is put to test. We again refer to 
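the null-geodesic equations (9.44) in the Schwarzschild spacetime. From
these we get, by eliminating the independent variable $\lambda$, the following
equation:

$$\left(\frac{dr}{dt}\right)^2 = \left(1 - \frac{2GM}{r}\right)^2\left[1 - \frac{b^2}{r^2}\left(1 - \frac{2GM}{r}\right)\right]. \qquad (10.29)$$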


Let KPE denote a null-geodesic track from a planet K to Earth
E, grazing the Sun’s surface at P as shown in Figure 10.6. The radial
Schwarzschild coordinate $r$ is measured from the Sun’s centre. At P, the
conditions are $r = R_\odot$, the Sun’s radius, while the ‘shortest-distance’
requirement means $dr/dt = 0$ at P. Let $r = R_1$ at K and $r = R_2$ at E.
What is the time taken by light to travel the path KPE?

We break the answer into two bits, the time T taken by light to go 
from K to P and the time taken from P to E: 


\[ T = f(R_1, R_\odot) + f(R_\odot, R_2), \tag{10.30} \]


where the $f$ functions are formally similar and defined by the integrals


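\[ f(R_1, R_\odot) = \int_{R_\odot}^{R_1} \frac{dr}{\left(1 - \dfrac{2GM}{r}\right)\left[1 - \dfrac{b^2}{r^2}\left(1 - \dfrac{2GM}{r}\right)\right]^{1/2}} \tag{10.31} \]


and


\[ f(R_\odot, R_2) = \int_{R_\odot}^{R_2} \frac{dr}{\left(1 - \dfrac{2GM}{r}\right)\left[1 - \dfrac{b^2}{r^2}\left(1 - \dfrac{2GM}{r}\right)\right]^{1/2}}. \tag{10.32} \]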


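Consider the first integral and let $D(r)$ denote the denominator in it.
Requiring that $dr/dt = 0$ at P means that the function $D(r)$ vanishes at
$r = R_\odot$. This determines $b$ and we have


\[ D(r) = \left(1 - \frac{2GM}{r}\right)\left\{1 - \frac{R_\odot^2}{r^2}\left(1 - \frac{2GM}{r}\right)\left(1 - \frac{2GM}{R_\odot}\right)^{-1}\right\}^{1/2}. \tag{10.33} \]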

Fig. 10.6. The echo of a radar signal bounced off a Solar-System body K
may arrive late if the signal path goes close to the Sun. This, as
explained in the text, happens because of the geometry of curved
spacetime near the Sun.

We now implement the weak-field approximation by using the fact
that $r > R_\odot \gg 2GM$. Some straightforward but tedious algebra then
leads us to

\[ D(r) \approx \left(1 - \frac{R_\odot^2}{r^2}\right)^{1/2}\left\{1 - \frac{GMR_\odot}{r(r + R_\odot)} - \frac{2GM}{r}\right\}. \tag{10.34} \]
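
The algebra behind (10.34) can be sketched as follows. This is a reconstruction to first order in $GM$, assuming the form (10.33) for $D(r)$ with $R_\odot$ the radius of closest approach; the intermediate steps are not taken from the text.

```latex
\begin{align*}
\frac{1 - 2GM/r}{1 - 2GM/R_\odot}
  &\approx 1 + 2GM\left(\frac{1}{R_\odot} - \frac{1}{r}\right)
   = 1 + \frac{2GM\,(r - R_\odot)}{r R_\odot}, \\
1 - \frac{R_\odot^2}{r^2}\,\frac{1 - 2GM/r}{1 - 2GM/R_\odot}
  &\approx \left(1 - \frac{R_\odot^2}{r^2}\right) - \frac{2GM R_\odot\,(r - R_\odot)}{r^3}
   = \left(1 - \frac{R_\odot^2}{r^2}\right)\left[1 - \frac{2GM R_\odot}{r(r + R_\odot)}\right], \\
D(r) &\approx \left(1 - \frac{2GM}{r}\right)\left(1 - \frac{R_\odot^2}{r^2}\right)^{1/2}
        \left[1 - \frac{GM R_\odot}{r(r + R_\odot)}\right] \\
     &\approx \left(1 - \frac{R_\odot^2}{r^2}\right)^{1/2}
        \left[1 - \frac{GM R_\odot}{r(r + R_\odot)} - \frac{2GM}{r}\right].
\end{align*}
```

The factor $(r - R_\odot)$ cancels in the second line, so the correction terms stay finite at $r = R_\odot$ and only the Euclidean factor $(1 - R_\odot^2/r^2)^{1/2}$ vanishes there.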

Therefore, we have


\[ f(R_1, R_\odot) = \int_{R_\odot}^{R_1} \left(1 - \frac{R_\odot^2}{r^2}\right)^{-1/2}\left[1 + \frac{GMR_\odot}{r(r + R_\odot)} + \frac{2GM}{r}\right] dr. \tag{10.35} \]


In the absence of any local gravitational field we would have got the
simple expression of Euclidean geometry for $f(R_1, R_\odot)$:


\[ f(R_1, R_\odot) = \int_{R_\odot}^{R_1} \left(1 - \frac{R_\odot^2}{r^2}\right)^{-1/2} dr. \tag{10.36} \]

We thus see that the time taken by light to travel is increased in the
presence of local gravitational sources. In principle we could measure
this effect by bouncing a radar signal from a planet when it is in superior
conjunction with respect to the Sun and the Earth; that is, when the radar
signal to and from the planet grazes the Sun. By comparing the to and fro
time for the signal with the radar time taken when the Sun is nowhere near
the signal path one can test the above prediction. The estimated effect
for Mercury is the highest and close to 200 $\mu$s. However, in practice
there are several sources of errors in this procedure. The distance of the
planet must be known to an accuracy of 1.5 km or so to ensure that the
error of the travel time does not exceed 10 $\mu$s. Also the bounce region
on the surface of the planet should be relatively small to minimize the
spread in the arrival of the return signal.
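
The size of the effect can be checked numerically. The sketch below integrates the difference between (10.35) and (10.36) in closed form for each leg of the path and doubles the result for the round trip. The orbital radii, the solar radius and the value of $GM/c^3$ for the Sun are rounded textbook values supplied here as assumptions, and the function names are illustrative only. For these inputs the result comes out at the same order as the figure quoted above, somewhat larger than 200 $\mu$s.

```python
import math

GM_SUN_OVER_C3 = 4.925e-6   # GM/c^3 for the Sun in seconds (assumed rounded value)
C_KM_PER_S = 299_792.458    # speed of light in km/s
R_SUN_KM = 6.96e5           # solar radius, taken as the closest approach of a grazing ray

def excess_delay(r_far_km, r_min_km=R_SUN_KM):
    """One-leg excess travel time in seconds: (10.35) minus (10.36), integrated exactly."""
    # Integral of the 2GM/r term: 2GM * ln[(r + sqrt(r^2 - R^2)) / R]
    log_term = 2.0 * math.log((r_far_km + math.sqrt(r_far_km**2 - r_min_km**2)) / r_min_km)
    # Integral of the GM R / [r (r + R)] term: GM * sqrt[(r - R) / (r + R)]
    root_term = math.sqrt((r_far_km - r_min_km) / (r_far_km + r_min_km))
    return GM_SUN_OVER_C3 * (log_term + root_term)

def round_trip_delay(r_planet_km, r_earth_km):
    """Excess delay for a signal going Earth -> planet -> Earth past the solar limb."""
    return 2.0 * (excess_delay(r_planet_km) + excess_delay(r_earth_km))

MERCURY_KM = 5.79e7   # mean orbital radius of Mercury (assumed)
EARTH_KM = 1.496e8    # mean orbital radius of the Earth (assumed)

delay = round_trip_delay(MERCURY_KM, EARTH_KM)
range_error_us = 2.0 * 1.5 / C_KM_PER_S * 1e6   # round-trip timing error from 1.5 km
print(f"round-trip excess delay: {delay * 1e6:.0f} microseconds")
print(f"1.5 km range error corresponds to {range_error_us:.0f} microseconds")
```

The second printed line reproduces the statement that a 1.5 km uncertainty in the planet's distance corresponds to about 10 $\mu$s in the round-trip travel time.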

In the 1970s, the first serious measurements were made by bouncing
radar signals emitted from the spacecraft Mariner 6 and 7 off the surface
of the Earth as the signals grazed the solar limb. The expected delays of
the order of 200 $\mu$s were observed within 3% error bars [16]. This test
has also been made more accurate with time and Table 10.1 gives the
updated information. However, the experiment performed by the Cassini
spacecraft on 10 October 2003 during its mission to Saturn improved
the accuracy of the experiment so as to reduce the error bars to 0.002%
[17]. Here signals were bounced between the spacecraft and the Earth
as they grazed the Sun in between.




10.7 The equality of inertial and gravitational mass

An important consequence of the principle of equivalence is the equality 
of inertial and gravitational mass. A little thought will convince us that 
Galileo’s experiment from the Leaning Tower of Pisa, which demonstrated
that all bodies fall freely with equal rapidity, is an essential part
of Einstein’s thought experiment involving the freely falling lift. Both
experiments are possible because the same quantity enters the law of
motion as inertial mass and the law of gravitation as gravitational mass.

Experiments with lunar laser ranging have been successful in measuring
the distance of the Moon from the Earth to within a few centimetres.
Such experiments also demonstrate that the Moon moves around
the Earth as predicted by the equations of general relativity. In particular, 
these experiments ruled out certain alternative theories of gravitation, 
like the Brans-Dicke theory, that allow for the variation of the inertial 
mass of a moving object as a function of its distance from another mass. 

Laboratory experiments of the torsion-balance type have been conducted
very accurately with various materials to establish this equality
with high sensitivity. Such experiments place stringent upper limits on
the possible presence of a ‘fifth force’ operating at a range of a few
metres. For a review of the measured accuracy of the principle of
equivalence, see the article by C. M. Will [18].

10.8 Precession of a gyroscope 

Although the Schwarzschild solution describes the gravitational effects 
of the Sun or the Earth with great accuracy, there is scope for further 
improvement. For instance, a rotating mass would introduce a $d\phi\,dt$ term
into the metric. Although the effects of such terms are very small for the 
Earth or the Sun, modern technology should be able to measure them. 

A proposed experiment that can measure the effect of a rotating 
mass makes use of gyroscopes. The axis of a gyroscope sent on an 
equatorial orbit around the Earth will slowly precess. An estimated rate 
of precession of ~7 arcseconds per year can be detected with present 
technology, and such an experiment has been on the drawing board for 
four decades, but not yet performed. The Gravity Probe B mission at 
present in space has promised a result by the year of writing of this 
account (2009). 

Table 10.1 gives the measured values of the PPN parameters or, 
rather, the limits set on their deviation from the predictions of general 
relativity. Although the experiments described go beyond what we have 
outlined above, it is clear from the entries of Table 10.1 that the theory 
of relativity comes out with flying colours. 


Exercises 

1. A photon of energy 1 MeV travels from the Earth to the Moon. By looking 
up physical data on the Earth and the Moon, calculate its energy upon arrival at 
the Moon. 

2. Use formula (10.20) to calculate the perihelion precession of Pluto, for e = 
0.25, perihelion distance 29.7 AU, aphelion distance 49.1 AU and orbital period 
248 years. 

Table 10.1. Limits on the measured values of the PPN parameters
(based on data reviewed by C. M. Will)

Parameter      Effect                 Limit                  Remarks
-------------  ---------------------  ---------------------  ------------------------------------
$\gamma - 1$   Time delay             $2 \times 10^{-3}$     Viking ranging
               Light deflection       $3 \times 10^{-4}$     VLBI
$\beta - 1$    Perihelion shift       $3 \times 10^{-3}$     $J_2 = 10^{-7}$ from helioseismology
               Nordtvedt effect       $6 \times 10^{-4}$     $\eta = 4\beta - \gamma - 3$ assumed
$\xi$          Earth tides            $10^{-3}$              Gravimeter data
$\alpha_1$     Orbital polarization   $4 \times 10^{-4}$     Lunar laser ranging
                                      $2 \times 10^{-4}$     PSR J2317+1439
$\alpha_2$     Spin precession        $4 \times 10^{-7}$     Solar alignment with ecliptic
$\alpha_3$     Pulsar acceleration    $2 \times 10^{-20}$    Pulsar $\dot{P}$ statistics
$\eta$         Nordtvedt effect (1)   $10^{-3}$              Lunar laser ranging
$\zeta_1$      -                      $2 \times 10^{-2}$     Combined PPN bounds
$\zeta_2$      Binary acceleration    $4 \times 10^{-5}$     $\ddot{P}_p$ for PSR 1913+16
$\zeta_3$      Newton’s third law     $10^{-8}$              Lunar acceleration
$\zeta_4$      -                      -                      Not independent

(1) Here $\eta = 4\beta - \gamma - 3 - 10\xi/3 - \alpha_1 - 2\alpha_2/3 - 2\zeta_1/3 - \zeta_2/3$.

N.B. The general theory of relativity predicts that all entries in this table are
zero.

3. The Newtonian escape velocity of a massive star is $v$. Show that it bends light
by an angle of $2v^2/c^2$.

4. An equilateral triangle is described by the space-tracks of light rays grazing a
spherical object of mass $M$ and Schwarzschild coordinate radius $R_0$. Show that
the sum of the three angles of this triangle exceeds $\pi$ by an amount

2lV3 GM 

in the approximation $2GM \ll R_0c^2$.

5. A source of light moves in a circular orbit of Schwarzschild coordinate radius
$2GM/f$ about a spherical mass $M$ in an otherwise empty space. It emits light
in the forward direction of its motion, of frequency $\nu_0$ in its rest frame. This is
received by a remote observer located at rest at a Schwarzschild coordinate radius
$R \gg 2GM$, with a frequency $\nu$ in its rest frame. Show that, for $f$ approaching
$\tfrac{2}{3}$ from below,


\[ \frac{\nu}{\nu_0} = \frac{\left(1 - \tfrac{3}{2}f\right)^{1/2}(2 - 2f)^{1/2}}{(2 - 2f)^{1/2} - f^{1/2}}. \]

6. The Newtonian potential for an oblate sun has the form

where $J$ is the quadrupole-moment parameter. Show that this produces a
perihelion precession (of purely Newtonian origin) of a planet with orbital latus
rectum $2l$ by an angle

\[ \frac{3\pi R_\odot^2 J}{l^2} \]

per orbit ($R_\odot$ is the radius of the Sun). Estimate this effect for Mercury in terms
of $J$. (The present estimate of $J$ is $J < 2.5 \times 10^{-5}$.)

7. Assume Newtonian physics and imagine a particle sent towards a spherical
mass $M$ with velocity $c$ along a direction such that the perpendicular distance
of the centre of the mass from that line of motion is $R$. If $GM \ll c^2R$, find
the distance $R_0$ of closest approach of the particle to $M$ and show that the
particle will eventually be moving away from the gravitating mass in a direction
asymptotically making an angle $2GM/(c^2R_0)$ with the original direction of
motion. (Note that this is half the value predicted by general relativity.)



Chapter 11 

Gravitational radiation 


11.1 Introduction 

Do Einstein’s equations permit the existence of gravitational waves? As
in the case of electromagnetism, where Maxwell’s field equations led to
the important deduction of electromagnetic waves carrying energy and
momentum with the speed of light, one expects the relativistic equations
to imply the existence of gravitational waves that do the same. However,
several issues intervene to make the answer to our question non-trivial.

The first problem is posed by the non-linearity of the Einstein
field equations. In the wave motion discussed in electromagnetic theory,
acoustics, elastic media, etc. the basic equations are linear and a
superposition principle holds. There is no corresponding situation in
general relativity. Secondly, there is no corresponding vector or tensor
in relativity that plays the role of the Poynting vector in the transport of
electromagnetic energy.

A third difficulty arises from the general covariance of the field 
equations. With the facility available to use any coordinate system as 
per convenience, it is not clear whether a particular ‘wavelike’ solution 
is a real physical effect or a pure coordinate effect. Thus one has to be on 
guard against solutions that describe coordinate waves that may travel 
‘with the speed of thought’. 

Even during Einstein’s lifetime, the above question did not receive an
unequivocal answer. His long-time coworker Leopold Infeld has narrated
one incident when Einstein thought that he had a proof that disproved the
existence of gravitational waves, only to find at the last moment before
he was to give a seminar on this finding that his proof broke down in a
tautology.

In the post-Einstein years, however, relativists like Hermann Bondi, 
Ivor Robinson, Felix Pirani and Roger Penrose were able to sort out 
real physics from the coordinate effects and generate confidence that 
gravitational waves exist. Still, the general non-linear description of a 
wave is too complicated for this elementary treatise and we will study 
only the linearized version in this chapter. It is this version that is relevant 
to the practical issue of detecting gravitational waves passing through a 
man-made receiver. 


11.2 Linearized approximation 

We return to the ‘weak-field’ approximation discussed in Chapter 8; but 
this time we drop the restriction of ‘slow motion’ and retain all time 
derivatives. Then we have 

\[ R_{iklm} = \tfrac{1}{2}\left[h_{im,kl} + h_{kl,im} - h_{km,il} - h_{il,km}\right]. \tag{11.1} \]

There is, however, a freedom of coordinate transformation still available.
This allows us to choose certain auxiliary conditions as follows.
Define

\[ \psi_i^k = h_i^k - \tfrac{1}{2}\delta_i^k h, \qquad h = h_i^i, \tag{11.2} \]

and choose coordinates such that

\[ \psi^k_{i,k} = 0. \tag{11.3} \]

In analogy to a similar transformation of the electromagnetic potentials
$A^k$, which ensures that $A^k{}_{,k} = 0$, this transformation is called a
gauge transformation and the $\psi_i^k$ are called gravitational potentials. The
conditions (11.3) are called the gauge equations. We still have a freedom
of coordinates available. Thus, if we try a coordinate transformation
given by


\[ x'^i = x^i + \xi^i, \qquad \Box\xi^i = 0, \tag{11.4} \]


we still satisfy (11.3) for the primed coordinates. We shall have occasion
to use this facility later.

With (11.3) holding, we get for the various relevant tensors


\[ R_{ik} = \tfrac{1}{2}\Box h_{ik}, \qquad R = \tfrac{1}{2}\Box h, \qquad R_{ik} - \tfrac{1}{2}\eta_{ik}R = \tfrac{1}{2}\Box\psi_{ik}. \]

The wave operator of course relates to the Minkowski spacetime
of special relativity. From Einstein’s field equations, the above relation
leads to


\[ \Box\psi_{ik} = -16\pi G T_{ik}. \tag{11.5} \]


We can write a formal solution to this equation, assuming that the
sources ($T_{ik}$) are confined to a bounded compact 3-volume $V$:


\[ \psi_{ik}(t, \mathbf{r}) = -4G\int_V \frac{T_{ik}(t - |\mathbf{r} - \mathbf{R}|, \mathbf{R})}{|\mathbf{r} - \mathbf{R}|}\, d^3R. \tag{11.6} \]


What type of sources does one talk about? We shall return to this 
important issue shortly. 


11.2.1 Plane waves 

A different type of solution to that given by a compact source is one of 
plane waves. (Compare this case with that of a plane electromagnetic 
wave.) 

Take a coordinate system as $x^0 = t$, $x^1 = x$, $x^2 = y$, $x^3 = z$, with,
as usual in this discussion, $c = 1$. For the present problem we assume
that apart from the wave the space is empty. So the equations (11.5)
yield

\[ \Box h_{ik} = 0. \tag{11.7} \]


For a plane wave travelling in the $x$ direction all $h_{ik}$ are functions of
$(t - x)$ only. Hence the gauge conditions become


\[ \frac{\partial\psi_i^k}{\partial x^k} = \frac{\partial\psi_i^0}{\partial t} + \frac{\partial\psi_i^1}{\partial x} = \frac{\partial}{\partial t}\left(\psi_i^0 - \psi_i^1\right) = 0, \tag{11.8} \]


so that $[\psi_i^0 - \psi_i^1]$ is a function of $x$ only. However, if we are admitting
only wave-type physical functions, then this may be set equal to zero.
Thus we have $\psi_i^0 = \psi_i^1$.

Next, using the freedom provided by Equation (11.4), we can use
the arbitrary $\xi^i$ as a function of $(t - x)$ to make the following quantities
vanish:


\[ \psi_1^0, \quad \psi_2^0, \quad \psi_3^0, \quad \psi_2^2 + \psi_3^3. \]

Since $\psi_i^0 = \psi_i^1$, we also have zero values for $\psi_1^1$, $\psi_2^1$ and $\psi_3^1$. Further,
$\psi_0^0 = \psi_0^1 = -\psi_1^0 = 0$. Hence $\psi_i^i = 0$ and we have $h_{ik} = \psi_{ik}$. In short,
the only quantities that cannot be rendered zero are $\psi_2^3$ and $\psi_2^2 - \psi_3^3$. In
terms of the $h_{ik}$, we can say that the plane wave is characterized by two
functions,


\[ h_{22} = -h_{33} \quad \text{and} \quad h_{23}. \]

This is the most elementary plane-wave solution. It may be compared
with the plane electromagnetic wave travelling in the $x$ direction having
an electric field $E_2$ in the $y$ direction and a magnetic field $B_3$ in the $z$
direction, with $|E_2| = |B_3|$.

What is the energy flux in such a wave, analogous to the Poynting
vector


\[ \boldsymbol{\Phi} = \frac{1}{4\pi}(\mathbf{E} \times \mathbf{B}) \]

for the electromagnetic case? In Newtonian gravitation one can define
an energy density of gravitational field in terms of the potential $\phi$ as
follows:

\[ \frac{(\nabla\phi)^2}{8\pi G}. \]

The notion of gravitational energy in general relativity has been a 
subject of a lot of discussion and controversy, especially since relativity 
discards the notion of force in the context of gravity. Nevertheless, a 
limited meaning can be attached to the concept of gravitational energy 
density and flux of that energy in a wave. Since a detailed description of 
the topic would lead us into technical details away from the main theme 
we are describing, we will simply quote the result, leaving the reader 
to look up advanced texts such as References [19, 20]. In the case of 
the plane wave described above, the energy flux of a plane gravitational 
wave is given by 


\[ \mathcal{F} = \frac{1}{16\pi G}\left[\dot{h}_{23}^2 + \tfrac{1}{4}\left(\dot{h}_{22} - \dot{h}_{33}\right)^2\right]. \tag{11.9} \]


The over dot denotes differentiation with respect to t. Thus we have 
a well-defined expression for the energy carried by a gravitational wave 
as it travels with the speed of light in vacuum. We will next look at 
sources that can produce gravitational waves and the way they can be 
detected. 


11.3 Radiation of gravitational waves 

Again we refer the reader to advanced texts (e.g., References [20, 21]) 
for the somewhat intricate manipulation needed to derive an apparently 
simple result, which we just state below. Assuming that we have a compact
time-dependent source confined to a 3-volume $V$, the gravitational
wave emerging from it will appear to a remote observer as a plane wave
passing by him. Using local coordinates in which the $x$-axis is along
the direction of propagation of the wave, we can use the results of the
previous section. Calculations give, for a source located at distance $R_0$,

\[ h_{23} = -\frac{2G}{3R_0}\ddot{D}_{23}, \qquad h_{22} - h_{33} = -\frac{2G}{3R_0}\left(\ddot{D}_{22} - \ddot{D}_{33}\right), \tag{11.10} \]

where we define the quadrupole-moment tensor of the source by the 
standard formula: 


\[ D_{\alpha\beta} = \int_V \rho\left(3x^\alpha x^\beta - \delta_{\alpha\beta}r^2\right) dV. \tag{11.11} \]


Here the coordinates $x^\alpha$ are Cartesian and $r$ denotes the Euclidean
distance from the origin of a point with these coordinates.

As in the case of the oscillating electric dipole radiating electromagnetic
waves, we can work out the net loss of energy by the source
above through gravitational radiation. The answer comes out to be


\[ \mathcal{P} = \frac{G}{45c^5}\,\dddot{D}_{\alpha\beta}^{\,2}. \tag{11.12} \]


Thus $\mathcal{P}$ is the power radiated by the source leading to its energy reservoir
being depleted.

We have deliberately restored the velocity of light to its rightful place
in the above formula. Its high power (5) serves to tell us that, unless the
third time derivative of the quadrupole moment is enormously high, the
power radiated is insignificant.

Let us estimate the energy radiated by a laboratory source that is in
the form of two masses of $M$ kg each, going round each other with a
period of a millisecond, the overall length scale of the apparatus being
$L$ metres. A crude estimate of the quadrupole moment of this system is


\[ D = ML^2 \times 10^7 \ \text{g cm}^2, \]


while the triple time derivative of it would be as high as $ML^2 \times 10^{16}$
c.g.s. units. Now square this and multiply by the coefficient $G/(45c^5)$,
which is approximately $5 \times 10^{-62}$. Thus we get a minuscule power of
$\sim 5 \times 10^{-24}$ erg per second for $M = 10$ and $L = 10$, say. Clearly it
is very unlikely that a terrestrial technology can in the near future produce
a laboratory-based source of gravity waves of any practical significance.
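
The arithmetic of this estimate can be reproduced in a few lines. The sketch below evaluates the coefficient $G/(45c^5)$ in c.g.s. units and applies formula (11.12) to the crude quadrupole estimate above. The function names are illustrative, and the rule that each time derivative contributes a factor of 1/period is the same rough assumption the text makes.

```python
G_CGS = 6.674e-8      # gravitational constant, cm^3 g^-1 s^-2
C_CGS = 2.998e10      # speed of light, cm s^-1

def quadrupole_power(dddot_d):
    """Radiated power in erg/s from formula (11.12), given the third time
    derivative of the quadrupole moment in c.g.s. units."""
    return G_CGS / (45.0 * C_CGS**5) * dddot_d**2

def lab_source_power(mass_kg, length_m, period_s=1.0e-3):
    """Crude estimate: D ~ M L^2, each time derivative ~ a factor of 1/period."""
    d = (mass_kg * 1.0e3) * (length_m * 1.0e2) ** 2   # quadrupole moment in g cm^2
    return quadrupole_power(d / period_s**3)

coefficient = G_CGS / (45.0 * C_CGS**5)
power = lab_source_power(mass_kg=10.0, length_m=10.0)
print(f"G/(45 c^5) = {coefficient:.1e} c.g.s. units")
print(f"power      = {power:.1e} erg/s")
```

With the constants used here the coefficient comes out slightly above the rounded $5 \times 10^{-62}$ quoted in the text, so the power is a few times $10^{-24}$ erg per second, in agreement with the estimate to within its intended accuracy.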


11.4 Cosmic sources of gravitational waves 

Given the above calculation, it is clear that we need to look to the cosmos 
for sources strong enough to be noticed. We outline some that are likely 
to play a significant role in gravitational-radiation astronomy. 


11.4.1 Coalescing binaries

Consider a binary star system of two stars with masses $m_1$ and $m_2$
moving round one another. As is well known, the Newtonian equations
of motion are given by


\[ m_1\ddot{\mathbf{r}}_1 = -Gm_1m_2\frac{(\mathbf{r}_1 - \mathbf{r}_2)}{|\mathbf{r}_2 - \mathbf{r}_1|^3}, \qquad m_2\ddot{\mathbf{r}}_2 = Gm_1m_2\frac{(\mathbf{r}_1 - \mathbf{r}_2)}{|\mathbf{r}_2 - \mathbf{r}_1|^3}. \]
These are solved by going to the barycentric frame of reference. The
centre of mass has coordinates


\[ \mathbf{R} = \frac{m_1\mathbf{r}_1 + m_2\mathbf{r}_2}{m_1 + m_2} \tag{11.13} \]


and it satisfies the condition $\dot{\mathbf{R}}$ = constant. Since gravitational radiation
involves the third time derivative of the quadrupole moment (vide Equation
(11.12)), without loss of generality we may set $\mathbf{R}$ as constant. We
also have, from (11.13),

\[ \mathbf{r}_1 = \mathbf{R} + \frac{m_2}{m_1 + m_2}\mathbf{r}, \qquad \mathbf{r}_2 = \mathbf{R} - \frac{m_1}{m_1 + m_2}\mathbf{r}. \]

Fig. 11.1. This figure shows stars A and B following their elliptical
orbits while their centre of mass C remains stationary. Such a system
emits gravitational waves. With loss of energy, the orbits shrink. A and B
come closer and closer and ultimately coalesce. BA is the vector $\mathbf{r}$.

As shown in Figure 11.1, $\mathbf{r}$ is the vector $(\mathbf{r}_1 - \mathbf{r}_2)$. A general solution for
this vector is the same as for a particle of reduced mass $m_1m_2/(m_1 + m_2)$
moving under the gravitational field of a single mass $(m_1 + m_2)$. Such
a particle can describe a bound (elliptical) or unbound (parabolic or
hyperbolic) orbit. For a binary star the former type of orbit is relevant.
We take the special case of a circular orbit. Assuming an angular speed
of $\omega$ and orbital radius $r$, we get the rectangular coordinates of the two
stars respectively as

\[ x_1 = r\cos(\omega t), \qquad y_1 = r\sin(\omega t), \]

\[ x_2 = -r\cos(\omega t), \qquad y_2 = -r\sin(\omega t). \]

Using these coordinates and taking note of the above transformations
we get for the two masses

\[ |\dddot{D}_{11}| = 24\,\frac{m_1m_2}{m_1 + m_2}\,r^2\omega^3|\sin(2\omega t)|; \qquad |\dddot{D}_{22}| = 24\,\frac{m_1m_2}{m_1 + m_2}\,r^2\omega^3|\sin(2\omega t)| \tag{11.14} \]


so that, averaged over a period, the value of $\dddot{D}_{\alpha\beta}^{\,2}$ is


\[ 288\left(\frac{m_1m_2}{m_1 + m_2}\right)^2 r^4\omega^6. \]

Thus the radiation rate is


\[ \mathcal{P} = \frac{32G}{5c^5}\left(\frac{m_1m_2}{m_1 + m_2}\right)^2 r^4\omega^6. \tag{11.15} \]


Now the energy of the binary is given by

\[ \mathcal{E} = -\frac{Gm_1m_2}{2r}. \tag{11.16} \]


We also have the third Kepler law from orbital dynamics:

\[ r^3\omega^2 = G(m_1 + m_2). \tag{11.17} \]

So, with the loss of energy through gravitational radiation, the magnitude
of $\mathcal{E}$ increases and this means that $r$ decreases. From the relations
derived above we have

\[ \frac{d\mathcal{E}}{dt} = \frac{Gm_1m_2}{2r^2}\frac{dr}{dt} = -\mathcal{P}, \]

and a little manipulation leads to the rate of shrinkage of the orbit as


\[ \frac{dr}{dt} = -\frac{64G^3m_1m_2(m_1 + m_2)}{5c^5r^3}, \tag{11.18} \]


and this in turn tells us that the period $P$ of the binary is reduced at the
rate


\[ \frac{dP}{dt} = -\frac{192\pi G^{2.5}(m_1 + m_2)^{1/2}m_1m_2}{5c^5r^{2.5}}. \tag{11.19} \]
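
The chain from (11.15) to (11.19) can be checked numerically. The sketch below evaluates the three formulae for a circular binary in SI units and confirms that they are mutually consistent with the energy (11.16) and Kepler's law (11.17). The masses and separation are illustrative values chosen here, not data for any particular system, and the function name is hypothetical.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m s^-1
M_SUN = 1.989e30     # solar mass, kg

def orbit_decay(m1, m2, r):
    """Return (power, dr/dt, dP/dt) for a circular binary of separation r,
    following (11.15), (11.18) and (11.19)."""
    total = m1 + m2
    omega = math.sqrt(G * total / r**3)                       # Kepler's law (11.17)
    power = 32 * G / (5 * C**5) * (m1 * m2 / total) ** 2 * r**4 * omega**6
    drdt = -64 * G**3 * m1 * m2 * total / (5 * C**5 * r**3)
    dpdt = -192 * math.pi * G**2.5 * math.sqrt(total) * m1 * m2 / (5 * C**5 * r**2.5)
    return power, drdt, dpdt

# Illustrative values only: two 1.4 solar-mass stars, 10^9 m apart.
m1 = m2 = 1.4 * M_SUN
r = 1.0e9
power, drdt, dpdt = orbit_decay(m1, m2, r)
period = 2 * math.pi * math.sqrt(r**3 / (G * (m1 + m2)))

print(f"period = {period:.3e} s, power = {power:.3e} W")
print(f"dr/dt  = {drdt:.3e} m/s, dP/dt = {dpdt:.3e} s/s")
```

These expressions hold for a circular orbit only. An eccentric orbit radiates more strongly, so the sketch should not be expected to reproduce the measured period change of an eccentric system.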


This phenomenon becomes more dramatic as the binary stars get
closer and move faster and faster. The radiation rate also increases and
becomes dramatically large when the stars finally coalesce. Needless
to say, those looking for evidence for gravitational radiation opt for
coalescing binaries as their most favoured option.

Although an example of this kind is still to come, the above result 
was strikingly but indirectly verified when observations were made of 
the binary pulsar PSR 1913+16. This is a pulsar that forms a pair 
with another neutron star. The above result was applied to this binary 
system. Studies by J. H. Taylor and J. M. Weisberg in 1984 led to the 
conclusion that the period of this binary system was decreasing at the 
rate of 2.4 picoseconds per second [22]. Such a tiny measurement was 
possible because pulsars are good timekeepers. The observations of the 
relative positioning of the two members in the binary system give further 
information on the shrinking of the orbit. It was shown that the result was 
consistent with the general relativistic theory of gravitational radiation 
and did not give such a good fit to some of the alternative theories of 
gravitation. 


11.4.2 Explosive sources 

More dramatic than binaries are sources like supernovae, active galactic 
nuclei, mini-creation events (predicted by an alternative cosmology as 
mentioned in Chapter 18), etc. in which a large mass undergoes a rapid 
redistribution over a substantial volume. Recalling the formula (11.12), 
we carry out a crude order-of-magnitude calculation as follows. 

Let $M$ be the mass involved and $R$ its characteristic linear size. Then
the quadrupole moment of the distribution will be of the order of $\eta MR^2$.
Here $\eta$ is a dimensionless number, which also includes information on
the anisotropy of the matter distribution. For example, a spherically
symmetric matter distribution will have $\eta = 0$. Further, let $T$ denote the
characteristic time scale for change of the system. Then the third time
derivative of the quadrupole moment is given by

\[ \dddot{D} = \frac{\eta MR^2}{T^3} \]

and formula (11.12) gives the radiation rate as


\[ \mathcal{P} \approx \frac{G}{45c^5} \times \frac{(\eta MR^2)^2}{T^6}. \tag{11.20} \]


Let us take a supernova and set $M = 10M_\odot$, $R = 10^{12}$ cm and
$T = 10^4$ s. With $\eta = 10^{-1}$ the above formula gives


\[ \mathcal{P} = 2.4 \times 10^{29} \ \text{erg s}^{-1}. \]


Compared with the Sun’s luminosity, this is a fraction as small as
$10^{-4}$. It may of course be possible that in a particular frequency band
over a short time scale the emission may be much higher than the average
calculated above. This now brings up the question of detectors: how are
we to detect the signals, which may be very weak astronomically?

Fig. 11.2. The world lines of two particles A and B are shown. If a
gravitational wave passes by, the separation vector $V^i$ of these world
lines would change.


11.5 Experimental detection of gravitational waves

How do we detect gravitational waves passing through space in our
neighbourhood? As in the case of any wave motion, we need an interacting
substance to ‘respond’ to the passing wave in such a way that
we can spot and measure the response. In this case the clue is given by
the equation of geodesic deviation derived in Chapter 5. Equation (5.24)
may be rewritten in a slightly different notation, as

5 2 V‘ . . 

-j-r = Kt.uv'u'". (ii.2i) 

To recapitulate, as shown in Figure 11.2, two neighbouring geodesics
$\Gamma_A$ and $\Gamma_B$, describing two free test particles A and B separated by
a small separation vector $V^i$, come closer or move apart depending
on the nature of the ambient spacetime geometry. The above equation
shows the variation of $V^i$ with respect to the affine parameter, $u$, as the
particles move along their geodesics. The passage of a wave will change
the geometry and hence the value of the driving term $R^i{}_{klm}$. So the
mechanical movement of A and B will indicate the presence of the wave.

The practical problem which a detector has to handle is how to translate
the acceleration produced by (11.21) so that it can be measured, given
that it is a very small effect amidst other environmental and hence noisy
effects of larger magnitudes. Since the early 1960s mechanical detectors
have been designed to capture the small and elusive gravitational waves.
We describe three major attempts.


11.5.1 Bar detectors 

Joseph Weber played a pioneering role in designing a detector and using 
it for measurements. He used cylindrical bars, each with length 153 
cm, diameter 66 cm and weight 1.4 tons. Each bar was suspended by 
a wire in vacuum and mechanically decoupled from its surroundings. 
Ideally the bar should be completely isolated from its surroundings. The 
bar has a fundamental frequency of 1660 Hz for lengthwise oscillation. 
Figure 11.3 shows a bar detector. 

Fig. 11.3. A bar detector of gravitational waves located in Frascati
near Rome.

The bar has piezoelectric strain transducers. When a gravitational
wave passes through the cylinder, different parts of it feel the acceleration
and the tendency to be displaced from their normal positions causes them
to be strained. The strain transducers respond to these strains and produce
corresponding electric fields. Measurements of these fields lead to the
estimate of the intensity of the gravitational wave. Because of resonance
at 1660 Hz, the detector is most sensitive to waves of frequencies near
this value.
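
As a rough plausibility check on the 1660 Hz figure, the fundamental lengthwise mode of a free bar of length $L$ has frequency $f = v_s/(2L)$, where $v_s$ is the speed of longitudinal sound waves in the bar. The sketch below assumes an aluminium bar with $v_s \approx 5.1$ km/s. Both the material and the sound speed are assumptions made here for illustration and are not given in the text.

```python
def fundamental_frequency(length_m, sound_speed_m_s):
    """Lowest longitudinal mode of a free-free bar: half a wavelength fits in the bar."""
    return sound_speed_m_s / (2.0 * length_m)

BAR_LENGTH_M = 1.53        # bar length quoted in the text (153 cm)
SOUND_SPEED_AL = 5.1e3     # assumed longitudinal sound speed in aluminium, m/s

f0 = fundamental_frequency(BAR_LENGTH_M, SOUND_SPEED_AL)
print(f"fundamental frequency ~ {f0:.0f} Hz")
```

Under these assumptions the estimate lands within a percent or so of the quoted resonance, which is as close as the assumed sound speed allows.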

Weber had sited a detector in the University of Maryland and another
in the Argonne National Laboratory near Chicago. If the source of
gravitational radiation were a distant cosmic one, it would affect both 
detectors in the same way. Thus Weber took as significant only those 
results which had the two detectors responding simultaneously. The rest 
could be dismissed as part of local noise. 

Even so, judging by the level of effects he got as significant, their 
sources had to be much more powerful than those we have looked at 
here. In short, the gravitational-wave community doubted, despite the 
care taken with the measurements, that real gravitational waves had 
been detected. Of course, had there been another detector with a different
technology also reporting positively, the results would have been
accepted. The controversies relating to the reality of Weber’s finding 
remained unsettled throughout his lifetime, despite the high regard he 
was held in by his peers. 

Nevertheless, although bar detectors exist in at least six laboratories 
now, it was generally felt that new technology was needed to improve 
the sensitivity and to reduce background noise. We next describe the 
interferometer technology that has been employed in several present-generation
detectors.


11.5.2 Laser interferometers

We encountered Michelson’s interferometer in Chapter 1 while discussing
the Michelson-Morley experiment. It creates two paths into
which a beam of light splits. After following the paths, the two beams
recombine. At that stage there will be interference between the waves
and how they combine will depend on their phases. If there is a slight
difference in the lengths of the two paths followed by the waves, that
difference will determine the outcome of interference.

Since an interferometer is very sensitive to small changes in pathlengths,
it is ideal for measurements of gravitational waves. The idea
is that, as a gravitational wave crosses the interferometer, it causes
changes in pathlengths because of the changes in geometry (vide Equation
(11.21)). If the effect can be measured, the intensity of the original
wave can be estimated.

The layout of the most ambitious of these detectors, the Laser Interferometer
Gravitational-Wave Observatory (LIGO), is shown in Figure 11.4.
The two paths, shown there at 90°, form the letter L. A partially 
reflecting/transmitting mirror serves as a ‘beam splitter’ at the vertex 
of L. The light beam is in the form of a laser. The arms of the L are each 
of length 4 km and the beams are allowed to make several rounds along 
the two paths. To keep the laser beams focussed, and not dispersed, it is 
desirable to make them travel through vacuum. The higher the level of 
vacuum, the less the dispersion of the laser waves and the sharper their 
tracks. 

Fig. 11.4. The LIGO detector (photograph by courtesy of the LIGO team).

If the two split beams meet with the same phase, they are allowed
to continue making the round of the interferometer. If there is a path
difference, a part is diverted to a photomultiplier and measured. The
chances are that what is collected there arises from the passage of
gravitational waves. In a way, the interferometer measures the gravitational
signals through their conversion to electrical ones, just as a microphone
converts sound waves into electrical disturbances. How large are the
expected signals?

We may express the answer in the form of the magnitude of $h_{ik}$ generated
by a typical source. For typical ‘strong’ sources the magnitudes
are about $10^{-20}$. Thus, for a pathlength of 10 km, the length fluctuations
to be measured are about $10^{-14}$ cm. A typical optical wavelength
is 500 nm, i.e., $5 \times 10^{-5}$ cm. This puts into perspective the precision
needed in these measurements.
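
The numbers in this estimate follow from the relation $\Delta L \approx hL$ between the dimensionless wave amplitude $h$ and the change in a pathlength $L$. A minimal sketch of the arithmetic is given below; the function name is illustrative, and the relation is used only at the order-of-magnitude level.

```python
def length_change_cm(strain, path_length_km):
    """Length fluctuation dL = h * L for a dimensionless strain h, returned in cm."""
    return strain * path_length_km * 1.0e5   # 1 km = 1e5 cm

strain = 1.0e-20                    # typical 'strong' source, from the text
delta_l = length_change_cm(strain, path_length_km=10.0)
wavelength_cm = 500e-9 * 100.0      # 500 nm expressed in cm
ratio = delta_l / wavelength_cm

print(f"length change            = {delta_l:.1e} cm")
print(f"fraction of a wavelength = {ratio:.1e}")
```

The length change is thus a few parts in $10^{10}$ of an optical wavelength, which is why the beams are made to traverse the arms many times and why noise control dominates the design.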

At present LIGO is functioning and taking routine observations. It 
has two identical detectors, one in Washington state and the other in 
New Orleans; thus they form a long baseline in the SE-NW direction of 
the mainland USA. It is hoped that simultaneous detection of signals by 
both the instruments would lend credibility to the finding. It is believed 
that LIGO’s sensitivity is just below the threshold for detection and 
upgrading of its capabilities is going on. There are other similar but 
smaller detectors in three other places, two in Europe and one in Japan. 

In short, there is considerable interest in the campaign to detect 
gravitational waves. The question is whether current technology is capable of 
doing so. Perhaps this thought has prompted scientists to be even more 
ambitious and think of space as the place of detection. We will look at 
this possibility next. 

11.5.3 LISA from space 

A major problem with terrestrial detectors is that of ‘noise’, which 
includes seismic disturbances and disturbances produced by terrestrial 
sources. Because the signal to be expected is very small, these noises 
have to be minimized and/or accounted for by sophisticated techniques 
of data analysis. Indeed, data-analysis techniques have played a major 
‘software’ role in the case of terrestrial detectors. 

Nevertheless, to minimize the noise, it is now proposed to have a 
space-based mission called the Laser Interferometer Space Antenna 
(LISA). As shown in Figure 11.5, it is made up of a trio of spacecraft 
forming an equilateral triangle, which follows at a distance of 20° behind 
the Earth on the same orbit. Its plane will make an angle of 60° with the 
Earth’s orbit and the triangle will face the Sun. Like the Earth, it will 
take one year to orbit the Sun. 

Fig. 11.5. A schematic picture of the LISA project (photograph by 
courtesy of NASA and the ESA). 

The vertices of the triangle will carry two mirror reflectors each for 
reflecting laser rays so that they describe the equilateral triangle. The 
arrival of a gravitational wave will modify the geometry in the vicinity 
of the triangle, which will be detected through the laser interferometry. 

This ambitious project is currently scheduled for completion in 2016, 
if funding is made available. It has been initiated by NASA and the 
ESA, although other space organizations are expected to join to make it 
a ‘world project’. 

11.6 Concluding remarks 

Gravitational waves pose a challenge to human intellect and technical 
achievements. If general relativity is right, one can find support for it 
in the detection of gravitational waves emitted by cosmic sources. Yet, 
the task is not an easy one, as we saw. The minute effects to be found 
and measured require the present technology to be pushed beyond its 
cutting edge. 


Exercises 

1. Using formula (11.19), estimate the rate of decrease of the period of a close 
binary star system with component stars of masses $M_\odot$ and $2M_\odot$, moving in a 
circular orbit with separation $10^{15}$ cm. 

2. A supernova core starts to collapse at $t = 0$ while retaining its ellipsoidal 
shape with its axes maintaining the same ratio. The starting values of the principal 
quadrupole moments of the core were (5, 4, 3) in units of $M_\odot R_\odot^2$. During collapse 
all dimensions of the core homologously shrink $\propto (t_0 - t)^{1/3}$ with $t_0 = 100$ s. 
Calculate the gravitational radiation emitted by the core per second when its 
linear size is half its starting value. 

3. Assuming that the rate of radiation of gravitational waves by a source of 
quadrupole moment $D_{ik}$ is proportional to $|\dddot{D}_{ik}|^2$, deduce formula (11.12) 
except for its numerical coefficient, using dimensional analysis. 

4. A cylinder of length $L$, cross-sectional radius $R$ and uniform density $\rho$ is 
made to spin at rate $\omega$ about an axis perpendicular to the length of the cylinder 
and passing through its midpoint. Show that it radiates gravity waves at the rate 
$2GM^2L^4\omega^6/(45c^5)$, where $M$ is the mass of the cylinder. Estimate this rate for 
$R = 1$ m, $L = 20$ m, $\rho = 7.8$ g cm$^{-3}$ and $\omega = 28$ radians s$^{-1}$. (For an iron 
cylinder a spin rate of $\omega = 28$ rad s$^{-1}$ is the maximum spin that can be borne 
by its tensile strength.) (This problem has been taken from Gravitation by C. W. 
Misner, K. S. Thorne and J. A. Wheeler, Freeman (1973).) 

5. From (11.18) estimate the time within which the binary stars will coalesce. 



Chapter 12 

Relativistic astrophysics 


12.1 Strong gravitational fields 

So far we have been concerned with gravitational effects that are weak, 
even when we were talking of effects requiring post-Newtonian 
approximation. To give the example of Schwarzschild's solution, in the term 

$$e^{\nu} = 1 - \frac{2GM}{r}$$

we assume that the departure from unity is small. Even for the compact 
white dwarf stars the difference $|e^{\nu} - 1|$ is less than $10^{-3}$. Thus we have 
been able to ‘get away with’ linearizing Einstein’s equations. While this 
has served our purpose in the limited applications of the theory, we 
have not been confronted with its inherent non-linearity. One reason 
why physicists and mathematicians studying relativity did not get into 
such confrontations for many years after the inception of the theory 
was because nature did not present a scenario where the full non-linear 
impact of the theory could be felt. However, one may look upon the 
year 1963 as a watershed when nature did oblige the relativist with such 
examples. 

In 1963, thanks to the cooperation between optical and radio 
astronomers, the so-called quasi-stellar objects (QSOs or quasars in 
brief) were discovered. These are compact-looking sources emitting 
optical and radio radiation. The first two quasars to be discovered, 3C273 
and 3C48, were at first mistaken for stars in our Galaxy.¹ Later studies 
revealed features that did not fit in with this interpretation and led to 
the conclusion that they are extragalactic, located far away like some of 
the most distant galaxies. Here we emphasize their role as radiators of 
huge amounts of energy from within a compact region. What 'energy 
machine' led a quasar to generate so much radiation from within a 
compact volume? 

¹ 3C273 is the 273rd source in the third Cambridge catalogue of radio sources. 

Prior to this discovery, theoreticians such as Fred Hoyle and William 
A. Fowler had conjectured about the existence of very massive stars. 
Supermassive objects, as they came to be known, had masses in the 
range $10^6$-$10^{10}$ solar masses. $M_\odot$ is the symbol for a solar mass and it 
serves as a mass unit in astrophysics: $M_\odot = 2 \times 10^{33}$ g. These authors 
had realized that such massive stars, if they exist, cannot sustain a 
luminous phase for long, since their nuclear resources would be unable 
to generate enough pressure to withstand the self-gravity of the ‘star’. 
For a Sunlike star the equilibrium can be sustained for a long time 
because the pressures generated by the nuclear reservoir can effectively 
counter the gravitational contraction. However, if we increase the mass 
in the calculation, the nuclear reservoir grows at a rate proportional to 
star-mass $M$, whereas the gravitational energy grows as $M^2$. Clearly 
such supermassive stars would evolve fast and consume their nuclear 
energy in a matter of a few thousand years. Being unable to maintain a 
steady volume, such a supermassive star would shrink and shrink, until 
its gravitational environment became very powerful. Hoyle and Fowler 
suggested that the contraction of these objects would be very rapid and 
the energy so generated (which was gravitational in origin) would be 
radiated by the star [23]. 

In the early months after the discovery of quasars, astrophysicists 
realized that a follow-up of the Hoyle-Fowler idea would lead them into 
territory they had never trod before, viz., an environment of a strong 
gravitational field that demanded the full application of general relativity. 
Thus there was convened an international conference on ‘Relativistic 
Astrophysics’, to which general relativists as well as theoretical and 
observational astronomers were invited. This meeting, held in Dallas, 
Texas, was to be the first of a biennial series of meetings known as ‘Texas 
Symposia’. The name recalls the early association of Texas with these 
meetings, which have since been held in different parts of the world. 
Quasars such as 3C273 (see Figure 12.1) dominated the discussion of 
the first Texas meeting, although the scope of relativistic astrophysics 
has since expanded and widened. 

We now look at some aspects of supermassive objects to demonstrate 
how relativity makes a significant difference to the evolution of a massive 
system, compared with the Newtonian theory. 
Fig. 12.1. A photograph of the optical image of the quasar 3C273. A sign 
of the unusual nature of the source is the jet seen emerging from the 
source. (Photograph by courtesy of NASA and the ESA.) 



12.2 Equilibrium of massive spherical objects 

To see how relativity makes a difference, we first look at the Newtonian 
equations describing the equilibrium of a spherical star of mass $M$ and 
radius $R_b$. 

We choose radial coordinate $r$ indicating distance from the centre, 
and denote by $m(r)$ the mass contained within a concentric sphere of 
radius $r$. We denote by $p(r)$ the pressure at distance $r$ and by $\rho(r)$ the 
density at that distance from the centre. We expect these quantities to 
decrease monotonically from the centre outwards, with pressure 
vanishing at the boundary given by $r = R_b$. 

We then have the mass-density relation for $r < R_b$, which is purely 
geometrical: 

$$\frac{dm(r)}{dr} = 4\pi r^2 \rho(r). \tag{12.1}$$

The second equation is of hydrostatic equilibrium. At any interior point 
with radial coordinate $r$, the mass within, $m(r)$, pulls (gravitationally) 
any material at $r$ inwards, whereas the pressure gradient at $r$ seeks to 
prevent this. The balance is described by the differential equation 

$$\frac{dp}{dr} = -\rho(r)\,\frac{Gm(r)}{r^2}. \tag{12.2}$$

These equations have to be supplemented by an equation of state 
relating pressure $p$ to density $\rho$. For example, if the star can be 
approximated by a 'polytrope' of index $n$, the relation is 

$$p \propto \rho^{1 + 1/n}. \tag{12.3}$$


Usually, for an ordinary star, n = 3 is a good approximation. For 
details of stellar structure see, for example, References [24,25]. For the 
relativistic discussion we will follow Fowler [26]. 
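
As an illustration of how Equations (12.1)-(12.3) determine a Newtonian model, the following minimal sketch integrates them outwards from the centre until the pressure vanishes. It is not taken from the references: the explicit Euler stepping, the choice of units ($G = K = \rho_c = 1$, where $K$ is an assumed constant of proportionality in (12.3) and $\rho_c$ the central density) and the function name are all assumptions made for the example.

```python
import math

def integrate_polytrope(n, K=1.0, G=1.0, rho_c=1.0, dr=5e-5):
    """Integrate (12.1) and (12.2) outwards with p = K * rho**(1 + 1/n).

    Returns (R_b, M): the radius at which the pressure vanishes and the
    mass contained within it. Explicit Euler steps of size dr are used.
    """
    gamma = 1.0 + 1.0 / n
    r, m = 0.0, 0.0
    p = K * rho_c ** gamma                     # central pressure from (12.3)
    while True:
        rho = (p / K) ** (1.0 / gamma)         # invert the equation of state
        dm = 4.0 * math.pi * r * r * rho * dr  # equation (12.1)
        # equation (12.2); the gradient vanishes at the centre
        dp = -rho * G * m / (r * r) * dr if r > 0.0 else 0.0
        if p + dp <= 0.0:
            # the pressure reaches zero inside this step: interpolate linearly
            frac = p / (-dp)
            return r + frac * dr, m + frac * dm
        r, m, p = r + dr, m + dm, p + dp

for index in (1, 3):
    radius, mass = integrate_polytrope(index)
    print(f"n = {index}: R_b = {radius:.3f}, M = {mass:.3f}")
```

For $n = 1$ the result can be compared with the known analytic values $R_b = \sqrt{\pi/2}$ and $M = \sqrt{2\pi}$ in these units, which gives a simple check on the step size.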

In the relativistic case, we start with the Schwarzschild coordinates 
with the line element written as 

$$ds^2 = e^{\nu}\,dt^2 - e^{\lambda}\,dr^2 - r^2(d\theta^2 + \sin^2\theta\,d\phi^2), \tag{12.4}$$

where $\lambda$ and $\nu$ are functions of $r$ only in a static situation. Given the 
energy tensor of a perfect fluid with bulk velocity $u^i$, pressure $p$ and 
density $\rho$, 

$$T^{ik} = (p + \rho)u^i u^k - p g^{ik}, \tag{12.5}$$

we get the following solution of the Einstein equations for $\lambda$ in the 
interior of the supermassive star as defined by $0 < r < R_b$: 

$$e^{-\lambda(r)} = 1 - \frac{2Gm(r)}{r}, \tag{12.6}$$

where, for any $R$ satisfying $0 < R < R_b$, 

$$m(R) = \int_0^R 4\pi r^2 \rho\,dr. \tag{12.7}$$

Note that the integrand above includes the term $T^0_0$, which in this 
case is $\rho$. Thus our equation (12.7) above finds an echo in the 
Newtonian equation (12.1). We also have $M = m(R_b)$ as the gravitational 
mass of the star. The Newtonian equation of hydrostatic equilibrium has 
a more complicated relativistic counterpart, however. Writing the 
energy-conservation equations $T^{ik}{}_{;k} = 0$ for $i = 1$ gives the following 
relationship between the pressure gradient and mass: 

$$\frac{dp}{dr} = -\frac{4\pi G r\,(p + \rho)}{1 - \dfrac{2Gm(r)}{r}}\left(p + \frac{m(r)}{4\pi r^3}\right). \tag{12.8}$$

This equation is exact and in the 'Newtonian limit' of $p \ll \rho$ and 
$Gm(r) \ll r$ it does reduce to Equation (12.2). Of course, as in the 
Newtonian case, we still need an equation of state if we are to be able to 
solve these equations. 
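
The reduction to the Newtonian limit can be seen numerically. The sketch below assumes that (12.8) has the standard Tolman-Oppenheimer-Volkoff form in units with $c = 1$, and compares it with (12.2) at sample points chosen purely for illustration.

```python
import math

def dp_dr_newton(r, m, rho, G=1.0):
    """Newtonian pressure gradient, equation (12.2)."""
    return -rho * G * m / r**2

def dp_dr_relativistic(r, m, rho, p, G=1.0):
    """Relativistic pressure gradient in the assumed form of (12.8), with c = 1."""
    return (-4.0 * math.pi * G * r * (p + rho)
            * (p + m / (4.0 * math.pi * r**3))
            / (1.0 - 2.0 * G * m / r))

def gradient_ratio(r, m, p_over_rho):
    """Ratio of relativistic to Newtonian gradient for a uniform sphere holding mass m."""
    rho = 3.0 * m / (4.0 * math.pi * r**3)
    p = p_over_rho * rho
    return dp_dr_relativistic(r, m, rho, p) / dp_dr_newton(r, m, rho)

# Hypothetical sample points: a weak field and a strong field.
weak = gradient_ratio(r=1.0, m=1e-6, p_over_rho=1e-6)
strong = gradient_ratio(r=1.0, m=0.2, p_over_rho=0.1)
print(f"weak field:   ratio = {weak:.6f}")
print(f"strong field: ratio = {strong:.3f}")
```

Every relativistic correction increases the magnitude of the pressure gradient needed for equilibrium, which is why equilibrium is harder to maintain in the strong-field regime.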
The Schwarzschild interior solution (see Chapter 9) may be seen 
as an extreme example of this approach since it supposes that $\rho =$ 
constant. Sometimes this is considered an example of the hard-core 
nucleon density. The core is incompressible and therefore has a constant 
density. We will take it as a limiting case in what follows. 


12.3 Gravitational binding energy 

The notion of binding energy, as it exists in the Newtonian framework, 
can be defined in relativity also. This leads to the definitions of two 
masses: the nucleonic mass $M_n$ and the gravitational mass $M$. We have 
already come across $M$ as the value $m(R_b)$. 

To understand the relationship of these notions, imagine the 
supermassive star to be broken into its basic constituents, the nucleons 
(protons and neutrons), which are then transported to 'infinity'. To this end, 
work has to be done against the gravitational force of the star. The mass 
of the star in this infinitely dispersed state includes the energy equivalent 
of this work and will therefore be greater than the mass in the compact 
bound state. The mass at infinity is simply the sum of the masses of all 
nucleons in it. (Strictly speaking, we should include electrons also in 
this count; but their contribution is negligible (about $5 \times 10^{-4}$).) This is 
defined to be the quantity $M_n$ mentioned before. 

The quantity 

$$B = M_n - M \tag{12.9}$$

is the binding energy of the star, and this is the work done in distributing 
it to infinity. Thus, for a bound object, $B > 0$ and it tends to zero in 
the infinitely dispersed state of the object. We may define $M_n$ more 
concretely by the integral 

$$M_n = 4\pi \int_0^{R_b} \rho_0\, r^2 e^{\lambda/2}\,dr, \tag{12.10}$$

where $\rho_0$ is the rest-mass density (see Chapter 7): that is, in a unit 
proper volume, count the number of nucleons and multiply it by their 
rest masses before adding up. The volume element multiplying $\rho_0$ in 
the integral above is the proper volume at constant $t$ of a spherical shell 
sandwiched between coordinates $r$ and $r + dr$. Assuming that there is 
no change in the number of nucleons as the object expands or contracts, 
we take $M_n$ to be constant. 

12.3.1 The Schwarzschild interior solution revisited 

Let us apply these concepts to the Schwarzschild interior solution derived 
in Chapter 9. Using the notation of Chapter 9, we have the inequality 
(because $\rho_0 < \rho$) 

$$M_n < \int_0^{r_0} 4\pi \rho r^2 e^{\lambda/2}\,dr = \frac{2\pi\rho}{\alpha^3}\left[\sin^{-1}(\alpha r_0) - \alpha r_0\sqrt{1 - \alpha^2 r_0^2}\right]. \tag{12.11}$$


For the limiting case of infinite pressure at the centre given by (9.23) 
we have the maximum of the right-hand side and hence, for that case, 


M a < 


2np 




( 12 . 12 ) 


Taking the case of the hard-core nucleon potential, we set the 
constant density $\rho$ equal to the nuclear density ${\sim}10^{15}$ g cm$^{-3}$. Then the 
above inequality will give 

$$M_n < 3M_\odot. \tag{12.13}$$

This means that the maximum mass that can be supported in this way 
is no more than three solar masses. Hence we get a perspective on how 
difficult it is for supermassive stars with masses millions of times the 
solar mass to remain in equilibrium after their nuclear resources have 
been exhausted. We consider such stars next. 

12.3.2 Supermassive stars 

We follow the discussion given by Fowler [26] and write the difference 
between actual and rest-mass densities as the internal (thermal) energy 
density 

$$u = \rho - \rho_0. \tag{12.14}$$

Then the binding energy is given by 

$$B = \int_0^{R_b} 4\pi r^2 e^{\lambda/2}(\rho - u)\,dr - \int_0^{R_b} 4\pi r^2 \rho\,dr$$

$$\phantom{B} = \int_0^{R_b} 4\pi r^2 \rho\left\{\left[1 - \frac{2Gm(r)}{r}\right]^{-1/2} - 1\right\}dr - \int_0^{R_b} 4\pi r^2 u\left[1 - \frac{2Gm(r)}{r}\right]^{-1/2}dr. \tag{12.15}$$

These relations apply for the static case when the object is in 
equilibrium. Let us consider the Newtonian situation first. In the Newtonian 
approximation of the above expressions we assume that $u \ll \rho$ and 
neglect the term $Gm(r)/r$ in the second integral but not the first: 

$$B = \int_0^{R_b} \frac{Gm(r)}{r}\,\rho \cdot 4\pi r^2\,dr - \int_0^{R_b} 4\pi r^2 u\,dr. \tag{12.16}$$


The first integral is the gravitational binding energy, which for a 
polytrope of index $n$ is given by (see Reference [25]) 

$$\Omega = \frac{3GM^2}{(5 - n)R_b}. \tag{12.17}$$

On writing $u = 3\epsilon p$, the second of the above integrals becomes (for 
a constant $\epsilon$), after an integration by parts in which the boundary term 
vanishes because $p = 0$ at $r = R_b$, 

$$\int_0^{R_b} 4\pi r^2 u\,dr = \int_0^{R_b} 4\pi r^2 \times 3\epsilon p\,dr = -\int_0^{R_b} 4\pi \epsilon r^3 \frac{dp}{dr}\,dr.$$

Using (12.2), the equation of hydrostatic equilibrium, we get 

$$\int_0^{R_b} 4\pi r^2 u\,dr = \int_0^{R_b} 4\pi \frac{Gm(r)}{r^2}\, r^3 \rho\epsilon\,dr = \epsilon\int_0^{R_b} \frac{Gm(r)}{r}\,\rho \cdot 4\pi r^2\,dr = \epsilon\Omega.$$

When the coefficient $\epsilon$ varies within the star, we may replace the constant 
$\epsilon$ by an average value $\langle\epsilon\rangle$. Then, from Equation (12.16) and the above 
relation, we get the binding energy as 

$$B = -(\langle\epsilon\rangle - 1)\Omega. \tag{12.18}$$

If $B > 0$, the star has negative total energy, relative to the case of infinite 
separation. If $B < 0$, the star has positive total energy, which normally 
should come from the star's nuclear reactions. If the star cannot supply 
this energy, it cannot be in equilibrium: it will contract. 

12.3.3 The post-Newtonian approximation 

Let us now look at the same problem from the next level of 
approximation, viz. the post-Newtonian approximation. In this case Equation 
(12.8) becomes 

$$\left(1 + \frac{Gm(r)}{r}\right)\frac{dp}{dr} = -\rho(r)\,\frac{Gm(r)}{r^2}\left(1 + \frac{p}{\rho} + \frac{4\pi p r^3}{m(r)} + \frac{3Gm(r)}{r}\right). \tag{12.19}$$

Using this relation together with (12.1), which remains unchanged in the 
relativistic case, we get back to the two integrals of Equation (12.16). In 
analogy to Equation (12.18), we get 

$$B = -(\langle\epsilon\rangle - 1)\Omega - 8\pi G\langle\epsilon\rangle\int_0^{R_b} p\,m(r)\,r\,dr - 12\pi G^2\left[\langle\epsilon\rangle - \frac{1}{2}\right]\int_0^{R_b} \rho\,[m(r)]^2\,dr.$$

Here the average $\langle\epsilon\rangle$ is with weights different from those in the 
Newtonian case discussed earlier. 

For supermassive stars, i.e., for stars of a million times the solar 
mass (or even more), we use the convective-polytrope approximation. 
This implies that the energy transport from the centre to outer layers 
of the star is through convection and the index is $n = 3$. In such a case 
$1 - \epsilon$ is constant in the star and small compared with unity. Writing it as 
$\beta/2$, after some manipulation we get the above equation in the following 
form: 

$$\frac{B}{M} = \frac{3}{4}\beta\,\frac{GM}{R_b} - 5.1\,\frac{G^2M^2}{R_b^2}. \tag{12.20}$$


Suppose we express the right-hand side of this equation in terms of 
the Schwarzschild radius $R_s = 2GM$ introduced in Chapter 9: 

$$\frac{B}{M} = \frac{3}{8}\beta\,\frac{R_s}{R_b} - 1.3\left(\frac{R_s}{R_b}\right)^2. \tag{12.21}$$

How high is the average density of the object? A short calculation 
leads to the result 

$$\langle\rho\rangle = 1.8 \times 10^{16}\left(\frac{M_\odot}{M}\right)^2\left(\frac{R_s}{R_b}\right)^3\ \mathrm{g\ cm^{-3}}. \tag{12.22}$$

Thus, for a star with mass $10^8 M_\odot$, we have a density comparable to that 
of water even for $R_b \sim R_s$. Also, from Equation (12.21) we see that the 
post-Newtonian term is comparable to the Newtonian one in magnitude 
when 

$$\frac{3}{8}\beta\,\frac{R_s}{R_b} \approx 1.3\left(\frac{R_s}{R_b}\right)^2,$$

that is, when $R_b \approx 3.4\beta^{-1}R_s$. The parameter $\beta$ is estimated to be 
about $10^{-3}$ for stars as massive as those with $M = 10^8 M_\odot$. Then our 
calculation above tells us that, for $R_b \approx 3400R_s$, the general relativistic 
contribution becomes significant. In short, one does not have to wait 
until $R_b$ becomes comparable to $R_s$ for general relativity to become 
relevant. 
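
The 'short calculation' behind the mean density can be sketched as follows. The physical constants are standard CGS values supplied here as assumptions (the text itself rounds the solar mass to $2 \times 10^{33}$ g), and the function names are illustrative.

```python
import math

G = 6.674e-8        # gravitational constant, cm^3 g^-1 s^-2
C = 2.998e10        # speed of light, cm s^-1
M_SUN = 1.989e33    # solar mass, g

def schwarzschild_radius_cm(mass_in_suns):
    """R_s = 2GM/c^2 in cm."""
    return 2.0 * G * mass_in_suns * M_SUN / C**2

def mean_density(mass_in_suns, rb_over_rs):
    """Mean density (g cm^-3) of a sphere of radius R_b = rb_over_rs * R_s."""
    r_b = rb_over_rs * schwarzschild_radius_cm(mass_in_suns)
    return mass_in_suns * M_SUN / (4.0 / 3.0 * math.pi * r_b**3)

prefactor = mean_density(1.0, 1.0)     # one solar mass at R_b = R_s
big_star = mean_density(1e8, 1.0)      # 1e8 solar masses at R_b = R_s
print(f"prefactor: {prefactor:.2e} g/cm^3")
print(f"1e8 solar masses at R_b = R_s: {big_star:.2f} g/cm^3")
```

Because the mean density at fixed $R_b/R_s$ falls as $M^{-2}$, a sufficiently massive object can be close to its Schwarzschild radius while still being no denser than water.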
It is convenient to use the central temperature $T_c$ instead of the stellar 
radius and we refer to [24] for the polytropic-model formula: 



(12.23) 


For the $n = 3$ polytrope considered here, the formula (12.21) for the 
binding energy becomes 



(12.24) 


Note that the binding energy has a maximum at a temperature 



(12.25) 


Figure 12.2 shows the variation of binding energy with $T_c$. The 
binding energy becomes zero at $T_c^0 = 5 \times 10^{13}(M_\odot/M)$ K and then 
turns negative as the temperature is further increased. This means that 
stars centrally hotter than this value need to have an energy source to 
keep an overall positive energy. 

Fig. 12.2. Variation of binding energy per unit mass of supermassive 
stars, with their central temperatures. 

Normally about $10^{-3}M$ energy is available for nuclear fusion, 
provided that the central temperature is adequate to trigger these reactions. 
Assuming that the second term of (12.24) dominates at high enough $T_c$, 
the energy requirement is 

$$\frac{B}{M} \approx -3 \times 10^{-27}\left(\frac{M}{M_\odot}\right)T_c^2 \gtrsim -10^{-3}. \tag{12.26}$$

The central temperature is as high as $8 \times 10^7$ K at the densities 
involved in the central region. On setting this value in the above relation, 
we get the upper limit on the mass: 

$$M \lesssim 10^8 M_\odot. \tag{12.27}$$

What does this mean? For supermassive objects with masses greater 
than this value, nuclear reactions are not able to provide sufficient energy 
for hydrostatic support. Even for stars of masses below the above limit, 
when the nuclear reactions have exhausted all nuclear fuel, they too can 
no longer sustain equilibrium. 
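
The order of magnitude of this limit can be reproduced in a few lines. The sketch below takes the coefficient multiplying $(M/M_\odot)T_c^2$ in (12.26) to be $3 \times 10^{-27}$ K$^{-2}$; that value is an assumption made here because it reproduces the limit (12.27), and the function name is illustrative.

```python
def max_mass_in_suns(t_central_k, coeff=3e-27, available_fraction=1e-3):
    """Largest M/M_sun for which coeff * (M/M_sun) * T_c**2 stays within the nuclear budget.

    coeff is the assumed coefficient of (12.26) in K^-2; available_fraction is the
    fraction of the rest mass that nuclear fusion can supply.
    """
    return available_fraction / (coeff * t_central_k**2)

limit = max_mass_in_suns(8e7)    # central temperature of 8e7 K, as in the text
print(f"M/M_sun < {limit:.1e}")
```

The result is a few times $10^7$, which is consistent with the rounded limit of about $10^8 M_\odot$.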

The equilibrium configurations need to be analysed from the stability 
point of view before they can be considered credible. S. Chandrasekhar 
carried out the small-oscillations analysis near the equilibrium states. He 
found that the models become unstable long before $R_b$ becomes close 
to $R_s$ (vide Reference [28]). 

Figure 12.2 demonstrates why it is more difficult for a supermassive 
star to remain in equilibrium in the post-Newtonian relativistic regime. 
If there is an inadequate supply of energy, the star cannot maintain 
equilibrium and will start shrinking. As the central temperature rises 
with shrinking of the star because of its gravity, its binding energy sinks 
further and this makes it more difficult for the star to supply the required 
energy. So the star shrinks even faster. This leads to what is commonly 
called gravitational collapse. 

Thus we see that supermassive stars take us to regions of strong 
gravity, where general relativity will apply in full. As seen here, we first 
encounter the post-Newtonian phase wherein additional terms are taken 
from relativity to supplement the Newtonian discussion. If we follow 
the path of gravitational collapse we should encounter even stronger 
gravitational fields and would need not a post-Newtonian approximation 
but the full use of general relativity. We will consider this phase in the 
following chapter. 

As part of relativistic astrophysics we must also consider the effect 
of gravity on light. That gravity affects light was demonstrated by the 
eclipse experiment of Chapter 10. How does the interaction manifest 
itself in nature? 


12.4 The first gravitational lens 

The bending of light rays due to the gravity of a massive object gives 
rise to a variety of phenomena now known as gravitational lensing. The 
lensing is caused by gravity rather than by the refraction of light passing 
through an inhomogeneous medium. We will discuss this topic next, 
since it has played a major role in astronomy since 1979. 

Fig. 12.3. The magnification produced by gravitational lensing is shown 
in the figure as the increase in the linear separation of a radio source 
from AB to A'B'. Very-long-baseline interferometry (VLBI) of quasars 
shows A'B' increasing at speeds many times the speed of light. In reality 
the source AB may not increase at superluminal rate, however. 

A brief historical description of the path that led in 1979 to the 
discovery of the first gravitational lens involving quasars may be in 
order. 

The first paper [29] on the subject, entitled 'Nebulae as gravitational 
lenses', was published by Fritz Zwicky in 1937. He clearly stressed the 
role of galaxies as light-deflecting objects that could produce multiple 
images of background sources. He pointed out the possibility of 
ring-shaped images, of flux amplification and of the use of this phenomenon 
for understanding the large-scale structure of the Universe [29]. Zwicky 
was ahead of his time in assuming that the nebulae (i.e., galaxies) would 
be several hundred billion times as massive as the Sun and, in another 
paper, he also estimated the probability of lensing occurring in 
extragalactic astronomy [30]. 

In the 1960s and 1970s, S. Refsdal, J. M. Barnothy, R. K. Sachs, 
R. Kantowski, C. C. Dyer, R. C. Roeder, N. Sanitt and several others 
published papers highlighting various aspects of gravitational lensing, 
ranging from purely theoretical investigations in general relativity to 
observational predictions in astronomy. We refer the reader to the book 
by Schneider, Ehlers and Falco [31] for further details. 

In a different context, Chitre and Narlikar [32] invoked the 
gravitational bending of radio waves from the VLBI components of a quasar 
by an intervening galaxy to explain the apparent superluminal 
separation of these components. If the galaxy is suitably located (i.e., close 
to the critical point of the lensing system) the apparent magnification 
of the separation between two components due to the lensing can be 
enormous and can convert a real subluminal speed into a superluminal 
one (see Figure 12.3). (However, the generally accepted interpretation 
of superluminal motion involves relativistic beaming.) 

The quasars, galaxies etc. entering our discussion in the rest of 
this chapter are far-away objects located well beyond our Galaxy. Such 
objects, as will be discussed in detail in Chapters 14-17, participate in 
the expansion of the Universe. One important consequence of this is that 
their spectra show redshift that increases in proportion to their distance 
from us. Thus, in what follows, we will have occasion to refer to such 
cosmological redshifts as indicators of distance. The reader unfamiliar 
with cosmology may wish to familiarize himself with cosmological 
redshifts by taking a quick look at Chapters 14 and 15. 

The real stimulus to the work on gravitational lensing came from 
the discovery of the first lens involving the quasars 0957+561 A and B 
by Walsh, Carswell and Weymann [33]. The quasars A and B showed 
very similar features and spectra at a redshift of ~1.4. Their angular 
separation was ~6 arcsec. Although the existence of two quasars with 
very similar features at close separation cannot be ruled out, the 
circumstantial evidence indicated a gravitational lens doubly imaging one 
source. The discovery of a lensing galaxy at a redshift of ~0.36 later 
lent further credibility to this scenario. The quasars and lensing galaxy 
are shown in Figure 12.4, while a ray diagram of the bending of light by 
the lens is shown in Figure 12.5. 

Fig. 12.4. A photograph of the twin quasar images 0957+561 A and B in 
which the image of one component is 'subtracted' from the other. The 
balance reveals the image of a galaxy, G, that is believed to have acted 
as a lens producing the two images of a single source. (From Stockton A. 
1980, Ap. J., Part 2, Letters to the Editor, 242, pp. L141-L144.) 

The basic features of a gravitational-lens system are described in 
the following section. By now there are several known lens systems and 
probable candidates, as listed in Table 12.1. The original expectations 
of Zwicky have been fully borne out. 


12.5 The basic features of a gravitational lens 

Figure 12.6 is a schematic diagram of a lens system wherein S is the 
source, a spherical mass M provides the deflector lens d and O is the 
observer. (We deplore the notation of denoting the lens by d as it is 
a symbol normally reserved for distance, but adopt it here because it 
has become common in the gravitational-lens literature.) The distance 
between the source and the observer is denoted by $D_s$, that between the 
source and the lens by $D_{ds}$ and that between the lens and the observer 
by $D_d$. 

Fig. 12.5. The ray diagram showing how gravitational bending by a 
lensing galaxy can produce two images A and B of a single source Q. 

Table 12.1. A partial list of interesting cases of gravitational lenses 

System          No. of    Lens        Image       Maximum separation 
                images    redshift    redshift    (arcsec) (a, b) 

Quasar images 
0957+561        2         0.36        1.41        6.1 
0142-100        2         0.49        2.72        2.2 
0023+171        3         ?           0.946       5.9 
2016+112        3         1.01        3.27        3.8 
0414+053        4         0.468 (c)   2.63        3.0 
1115+080        4         0.29        1.722       2.3 
1413+117        4         1.4 (c)     2.55        1.1 
2237+0305       4         0.0394      1.695       1.8 

Arcs 
Abell 370                 0.374       0.725 
Abell 545                 0.154       ? 
Abell 963                 0.206       0.77 
Abell 2390                0.231       0.913 
Abell 2218                0.171       0.702 
Cl0024+16                 0.391       0.9 
Cl0302+17                 0.42        0.9 
Cl0500-24                 0.316       0.913 
Cl2244-02                 0.331       2.237 

Rings 
MG1131+0456               ?           ?           2.2 
0218+357                  ?           ?           0.3 
MG1549+3047               0.11        ?           1.8 
MG1654+1346               0.25        1.75        2.1 
1830-211                  ?           ?           1.0 

(a) For arcs the maximum separation is the diameter of the corresponding Einstein ring. 
(b) For rings this corresponds to the diameter of the ring. 
(c) Assumed or still to be confirmed. 

Fig. 12.6. A schematic diagram of a gravitational lens. For details, 
refer to the text. 

The condition that the ray from the source passing outside the 
deflector at a distance $\xi$ reaches the observer as shown in Figure 12.6 is given 
by the rules of projection: 

$$\beta D_s = \frac{D_s}{D_d}\,\xi - \frac{2r_s}{\xi}\,D_{ds}. \tag{12.28}$$

Here $r_s = 2GM/c^2$ is the Schwarzschild radius of the deflector mass. 
We have tacitly assumed that the gravitational bending of light is small 
and so the angles $\beta$ and $\alpha$ (the bending angle of the original ray) are 
both small compared with unity. Also, when applying this relation over 
cosmological distances, we have to take due note of the non-Euclidean 
measures of redshift-related distance. Thus in general $D_{ds} \neq D_s - D_d$. 
The deflected ray in Figure 12.6 makes an angle $\theta$ with the line OM. 
Hence, with our small-angle approximation, $\theta D_d = \xi$. Therefore the 
above equation becomes 

$$\beta = \theta - \frac{2r_s D_{ds}}{D_d D_s}\,\frac{1}{\theta}. \tag{12.29}$$

This equation can be generalized to a full three-dimensional case in 
which the vectors $\beta$ and $\theta$ lie in different planes. We will continue with 
the two-dimensional simplification. 

It is convenient to define an angle $\alpha_0$ and a length $\xi_0$ by 

$$\alpha_0 = \sqrt{\frac{2r_s D_{ds}}{D_d D_s}}, \qquad \xi_0 = \alpha_0 D_d. \tag{12.30}$$
With these definitions the basic equation (12.29) reduces to the 
quadratic 

$$\theta^2 - \beta\theta - \alpha_0^2 = 0. \tag{12.31}$$

This tells us that there are two roots, i.e., there are two possible 
locations of images, whose angular separation is given by 

$$\Delta\theta = \sqrt{4\alpha_0^2 + \beta^2}. \tag{12.32}$$

Note that the two values of the roots $\theta_1$ and $\theta_2$ are of opposite signs, 
implying that the two images are located on the opposite sides of the 
lens. Note also that, if the source, lens and observer are collinear, then 
the angle $\beta = 0$ and there is no preferred plane for the rays to take. The 
geometry is then axisymmetric about the line SO. Thus we get a 
ring-like distribution of images with an angular separation from the source 
of $\theta = \alpha_0$. What we have described is, of course, a highly symmetric 
situation involving a symmetric matter distribution, a point source and a 
special alignment. In practice these conditions are not fully satisfied, but 
we may still get approximately ring-shaped images of extended sources, 
which are called Einstein rings in the literature. 
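
The two-image solution is easy to explore numerically. The following minimal sketch solves the quadratic (12.31) for given source angle and $\alpha_0$; the function names are illustrative, and angles are measured in units of $\alpha_0$ in the example.

```python
import math

def einstein_angle(r_s, d_d, d_s, d_ds):
    """alpha_0 of (12.30) in radians; all lengths must be in the same unit."""
    return math.sqrt(2.0 * r_s * d_ds / (d_d * d_s))

def image_angles(beta, alpha_0):
    """Both roots of theta^2 - beta*theta - alpha_0^2 = 0, equation (12.31)."""
    root = math.sqrt(beta**2 + 4.0 * alpha_0**2)
    return (beta + root) / 2.0, (beta - root) / 2.0

alpha_0 = 1.0                               # work in units of alpha_0
theta_1, theta_2 = image_angles(0.5, alpha_0)
print(f"images at {theta_1:.4f} and {theta_2:.4f}")
print(f"separation {theta_1 - theta_2:.4f}")

# Perfect alignment (beta = 0): both roots lie at the ring radius alpha_0.
print(image_angles(0.0, alpha_0))
```

The product of the two roots is always $-\alpha_0^2$, so one image lies outside the ring of radius $\alpha_0$ and the other inside it.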

A general lens is more difficult to quantify. However, we may make 
a few statements that can be proved using detailed mathematics, which 
the reader may look up in books or monographs specializing in 
gravitational lensing, e.g., Reference [31]. A general theorem proves that 
any transparent matter distribution with a finite total mass and weak 
gravitational field produces an odd number of images. Sometimes a few 
images are too faint to be seen, and we see an even number of images. In 
our simple case described above, we get two images. In this case, if the 
gravitating object is a point mass (black hole), the possibility exists that 
the incident rays would have small impact parameters, thus violating the 
weak-field condition. If the lens were a transparent sphere of matter, the 
theorem would still apply. 


12.6 The magnification and amplification 
of images 

Consider a simple example in Euclidean geometry of a spherical source
of radius a and luminosity L located at a distance D. The flux of radiation
received from the source is given by

$$l = \frac{L}{4\pi D^2}, \tag{12.33}$$

while the solid angle subtended by the source at the observer is

$$\Omega = \frac{\pi a^2}{D^2}. \tag{12.34}$$

The surface brightness of the source is given by

$$\sigma = \frac{L}{4\pi a^2}. \tag{12.35}$$

It is easy to verify from these results that

$$l = \Omega\sigma/\pi. \tag{12.36}$$

In other words, the surface brightness is proportional to the ratio of
the flux to the solid angle subtended. Since gravitational bending does
not introduce any additional spectral shift, we may assume this result
to be valid for lensed sources. Since the surface brightness of a small
source is not changed by lensing, the ratio of the flux of an image to that
of the source (in the absence of lensing) would simply equal the ratio of
their solid angles:

$$\mu = \Omega/\Omega_0, \tag{12.37}$$

the zero suffix standing for the unlensed situation. This is valid for
cases in which the images and sources are not extended so that one may
use a constant surface brightness. For extended sources one integrates
over the source with suitable weighting of the surface brightness at the
point.

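Returning to the small source, if $\boldsymbol{\beta}$, the source position, is related to
$\boldsymbol{\theta}$ by a generalization of Equation (12.29) to the full three-dimensional
case, we have the angular magnification given by the Jacobian

$$\frac{\Omega_0}{\Omega} = J[\boldsymbol{\beta}; \boldsymbol{\theta}] = \det\left|\frac{\partial\boldsymbol{\beta}}{\partial\boldsymbol{\theta}}\right|. \tag{12.38}$$

Thus the amplification of the flux is given by the reciprocal of the above
Jacobian.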

The Jacobian has great significance in the lensing calculations. The
parity of the image is decided by the sign of the Jacobian for that image.
If it is positive, the sense of its curves (i.e., clockwise or anticlockwise)
is preserved with respect to the source. For negative parity it is reversed.
Thus regions of opposite parity are separated in the lens plane. The
critical curves separating them are those on which the factor $\mu$ diverges.
This is, however, an idealization since the sources are in general extended
and infinite amplification of image brightness does not take place. These
critical curves in the source plane are called caustics. It can be shown that
the number of images changes by two if and only if the source position is
changed in such a way that it crosses a caustic. If the locations of caustics
are known, then, for a given source position, the number of images can
be determined using this property. Another result that is useful in this
respect is that for any transparent distribution of matter with finite mass
the number of images of a point source sufficiently misaligned with the
lens is unity.

Even though we do not in practice have enormously bright images
($\mu$ tending to infinity), another theorem guarantees that, for a transparent
lens, there is one image with positive parity having an amplification
factor not less than unity (i.e., the image is at least as bright as the
source). Thus lensing may mislead the observer into thinking that he or
she has found a very bright source.
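
The role of the Jacobian can be illustrated for the point-mass lens of the previous section. The sketch below assumes the lens mapping $\beta = \theta - \alpha_0^2/\theta$ (obtained by dividing the quadratic (12.31) by $\theta$) and the standard result that, for an axisymmetric lens, the determinant reduces to $(\beta/\theta)(d\beta/d\theta)$; that reduction is not derived in the text. All function names are hypothetical.

```python
import math

def image_positions(beta, alpha0):
    """Roots of theta**2 - beta*theta - alpha0**2 = 0 (point-mass lens)."""
    disc = math.sqrt(beta**2 + 4.0 * alpha0**2)
    return (beta + disc) / 2.0, (beta - disc) / 2.0

def jacobian(theta, alpha0):
    """det(d beta / d theta) for beta(theta) = theta - alpha0**2 / theta.

    Assumes the axisymmetric form (beta/theta) * (d beta/d theta),
    which equals (1 - x) * (1 + x) with x = (alpha0/theta)**2.
    """
    x = (alpha0 / theta) ** 2
    return (1.0 - x) * (1.0 + x)

def amplification(theta, alpha0):
    """Signed amplification: reciprocal of the Jacobian, as in Eq. (12.38)."""
    return 1.0 / jacobian(theta, alpha0)

theta_plus, theta_minus = image_positions(beta=0.5, alpha0=1.0)
mu_plus = amplification(theta_plus, 1.0)    # positive parity image
mu_minus = amplification(theta_minus, 1.0)  # negative parity image
```

In this sketch the Jacobian vanishes at $|\theta| = \alpha_0$, so the ring of radius $\alpha_0$ plays the role of the critical curve, and the positive-parity image comes out at least as bright as the unlensed source.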


Exercises 

1. A quasar shows time variation on the scale T. Assuming that special
relativistic causality limits the size of its diameter, show that the maximum
mass the quasar can have is $c^3 T/(4G)$. Express the answer in units of solar
masses when T is expressed in hours.

2. Assuming an interior Schwarzschild solution to apply for the hard-core 
nucleon potential, estimate the maximum redshift from the surface of such 
an object. 

3. Explain why nuclear energy generation cannot effectively support
supermassive stars beyond a limiting mass. Compare this limit with the way the
Chandrasekhar limit operates for white dwarfs.

4. Using the following definitions given by H. Bondi, derive the equations of 
Section 12.2 in the form given below: 

Definitions:

$$u = \frac{m(r)}{r}, \qquad w = 4\pi r^2 p(r).$$

Results to prove:

$$\frac{d\nu}{dr} = \frac{2(u + w)}{r(1 - 2u)},$$

$$\frac{dp}{dr} = -\frac{1}{2}(\rho + p)\frac{d\nu}{dr},$$


dr/ 
dr 


H 

tdw 


/dw \ 

( 1 - 2 u) (df- a ) 

where 

H = 2w — (u 2 + 6uw + w 2 ), 


P = 


u t dw 
4nr 2 \ d// 


')(£-)' 


u + w 
1-2 id 


P = ~ 


w z — 5u — w 
u 1 — 2 u 

5. Show that in the interior Schwarzschild solution the central redshift $z_c$ is
related to the surface redshift $z_s$ by the relation

$$1 + z_c = \frac{2}{2 - z_s}(1 + z_s).$$

As $z_s \to 2$, $z_c \to \infty$.

6. Show that the interior Schwarzschild solution is conformally flat. 

7. Show that a star located exactly along the line of sight from the Earth to
the Sun can under certain circumstances be seen as an Einstein ring of radius
$4GM_\odot/(c^2 R_\odot)$.



Chapter 13 

Black holes 


13.1 Introduction 

We found in the previous chapter that, if a massive star runs out of 
nuclear fuel, it would lose its equilibrium and begin to shrink. Even 
when nuclear fuel is available to the star, it may be insufficient to meet 
the demands for the star’s equilibrium. In the early 1930s the young 
astrophysicist Subrahmanyan Chandrasekhar had encountered a somewhat
similar situation when discussing the state of stars like the Sun,
after they run out of their nuclear fuel. He found that the star can still 
sustain equilibrium if its internal matter can attain the degenerate state. 
Degeneracy can arise if the density of matter is so high that all available 
energy levels of atoms are filled up, up to some low energy. In such a 
situation further compression of matter is not possible and gravity is held 
at bay. This is an excellent example of a macroscopic effect of quantum 
mechanics: a star as massive as the Sun feels an effect whose origin is 
in quantum mechanics. We cannot describe it in detail since that would 
take us farther away from our present interest. 

The early work on degenerate matter by R. H. Fowler had shown 
that every star on sufficient compression attains degeneracy, thereby
ensuring that the star would rest in peace in a state of very high density 
and small radius. It was felt that white dwarf stars are precisely the 
stars which are in this state. They are faint and very compact stars with 
radius typically 1% of the solar radius. 

Chandrasekhar, however, introduced a modification into the Fowler
calculation. He noticed that, for large-mass stars, the filled levels are so
high in energy that the electrons occupying them would be relativistic
and this would alter the degeneracy criteria. With his modification
Chandrasekhar [27] found an upper limit to the mass of a star existing
as a white dwarf. This limit, known as the Chandrasekhar limit, is
$1.44\,M_\odot$. This result means that stars more massive than this limit would
have central temperatures so high that the electrons there will not have
become degenerate. Without the degeneracy pressure the star cannot
remain in a state of equilibrium.

This conclusion implies that white dwarfs should not be found with
mass greater than the Chandrasekhar limit. So far this result has held
firm. The concept of relativistic degeneracy has become accepted and
astrophysicists have extended it to the neutron stars also. There
neutrons are tightly packed in a small volume and become degenerate,
again provided that the mass of the star does not exceed a limit close
to $2\,M_\odot$.

Nevertheless, in the early days Chandrasekhar faced considerable 
opposition to his result. His main opponent was no less a person than 
Arthur Stanley Eddington, who had played a pioneering role in stellar 
astrophysics. Eddington castigated Chandrasekhar’s use of relativistic 
degeneracy in the following words: 

[If Chandrasekhar is right, then ...] the star has to go on radiating and
radiating until, I suppose, it gets down to a few km. radius, when gravity
becomes strong enough to hold in the radiation, and the star can at last
find peace [...] I think there should be a law of Nature to prevent a star
from behaving in this absurd way...

Eddington was visualizing a situation in which a star finds itself
without any counterforce to its own gravity, which makes it contract
and continue to contract with ever increasing force. For, with
its ‘inverse-square behaviour’, gravity grows stronger as the object
shrinks, with the result that the contraction enters a run-away mode.
This phenomenon is called gravitational collapse. It is ironic that the
reductio ad absurdum-type argument used by Eddington against the
continued contraction of a massive star can be turned round to predict
the existence of a new genre of objects. An object of this type
develops a gravitational force so strong that it pulls back even the
light originating in the object, thus rendering it invisible to external
observers.

Such an object is called a black hole today. We will spend this
chapter summarizing the properties of black holes within the framework
of general relativity. We begin with a discussion of gravitational
collapse, the phenomenon that is supposed to lead to the formation of a
black hole.


13.2 Gravitational collapse

Before taking up the general relativistic problem, we briefly outline 
its Newtonian counterpart. For we will find it of interest to compare 
and contrast the descriptions of the phenomenon by these two leading 
theories of gravity. 


13.2.1 The Newtonian problem 

Let us consider a ball of matter of mass M and radius R having a
spherical symmetry in its physical parameters such as density $\rho$ and
pressure p. Suppose that it is undergoing a gravitational collapse, i.e.,
continued and ever-increasingly fast contraction. We ignore the effect of
pressure as an opposing agency to gravitation and write the equation of
motion as

$$\ddot{R} = -\frac{GM}{R^2}. \tag{13.1}$$

This equation represents the acceleration of a test particle on the
surface of the ball, and it can be easily solved with the initial conditions
$R = R_0$, $\dot{R} = 0$ at $t = 0$. We find that the time taken for R to reach
zero is

$$t_0 = \frac{\pi}{2}\sqrt{\frac{R_0^3}{2GM}}. \tag{13.2}$$


For a Sun-like star this works out at 29 minutes! The short time scale
indicates how powerful the collapse phenomenon can be. The above time
scale can be written in terms of the starting density as follows:

$$t_0 = \frac{\pi}{2}\sqrt{\frac{3}{8\pi G\rho_0}}. \tag{13.3}$$


We will now look at the relativistic problem, which leads to a
surprisingly similar answer.
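
The Newtonian collapse time is easy to evaluate numerically. The following is a rough sketch in cgs units, assuming the round values $M_\odot = 2 \times 10^{33}$ g and $R_\odot = 7 \times 10^{10}$ cm and $G = 6.674 \times 10^{-8}$ cm$^3$ g$^{-1}$ s$^{-2}$; the function names are hypothetical.

```python
import math

G = 6.674e-8      # gravitational constant, cm^3 g^-1 s^-2
M_SUN = 2.0e33    # solar mass in g (round value)
R_SUN = 7.0e10    # solar radius in cm (round value)

def collapse_time(mass, radius):
    """Pressure-free Newtonian collapse time: (pi/2) * sqrt(R0^3 / (2 G M))."""
    return 0.5 * math.pi * math.sqrt(radius**3 / (2.0 * G * mass))

def collapse_time_from_density(rho0):
    """Same time scale written in terms of the starting density rho0."""
    return 0.5 * math.pi * math.sqrt(3.0 / (8.0 * math.pi * G * rho0))

t_sun = collapse_time(M_SUN, R_SUN)              # in seconds
t_sun_minutes = t_sun / 60.0                     # a little under half an hour
rho_sun = 3.0 * M_SUN / (4.0 * math.pi * R_SUN**3)
t_sun_check = collapse_time_from_density(rho_sun)
```

Both forms depend only on the combination $M/R_0^3$, i.e., on the starting density, so a larger ball of the same density collapses in the same time.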


Example 13.2.1 Problem. In the above example let m(r) denote the mass
of the ball within radius r. Imagine the star as made of layers of different
densities and find the condition that the layers do not cross as the ball
collapses.

Solution. From (13.2) we see that the collapse time for m(r) is

$$t = \frac{\pi}{2}\sqrt{\frac{r_0^3}{2Gm(r)}},$$

where we assume that initially the value of r was $r_0$. For ‘no crossing’,
$m(r) = m(r_0)$ and we expect t to be an increasing function of $r_0$. (Thus the
outer layers collapse later.) So the condition is that $r^3/m(r)$ should increase
with r. On defining by $\bar{\rho}(r)$ the average density of m(r), we have

$$\frac{r^3}{m(r)} = \frac{3}{4\pi}\,\bar{\rho}(r)^{-1}.$$

Hence our requirement for no crossing is that $\bar{\rho}(r)$ should decrease with
r; i.e., there is no density inversion in the ball. The density should steadily
decrease outwards.


13.2.2 The general relativistic problem

We begin with a discussion of spherically symmetric collapse since this
is the only case that has been dealt with exactly. The line element for the
spacetime is given by

$$ds^2 = e^{\nu}\,dt^2 - e^{\omega}\,dr^2 - e^{\mu}(d\theta^2 + \sin^2\theta\,d\phi^2). \tag{13.4}$$

Here $\nu$, $\omega$ and $\mu$ are functions of r and t. The energy-momentum tensor
for the ball made of perfect fluid will be as given in (7.26):

$$T^{ik} = (p + \rho)u^i u^k - p g^{ik}. \tag{13.5}$$

The energy-conservation relations $T^{ik}{}_{;k} = 0$ then lead to two
equations:

$$\dot{\omega} + 2\dot{\mu} = -\frac{2}{p + \rho}\,\dot{\rho} \tag{13.6}$$

and

$$\frac{\partial\nu}{\partial r} = -\frac{2}{p + \rho}\,\frac{\partial p}{\partial r}. \tag{13.7}$$


We next simplify the problem by assuming that pressures are
unimportant during collapse. Thus, ignoring the pressure gradient in Equation
(13.7), we get $\nu$ independent of r and therefore a function of t only. A
time/time transformation can then be used to set $\nu = 0$. We have used
this trick before. Ignoring pressure reduces the problem to that of ‘dust’
and allows the coordinates to be given a ‘comoving’ interpretation. Thus
we assume that a comoving observer falling in with the collapsing ball
has constant coordinate values for r, $\theta$ and $\phi$. Thus such an observer has
t as his proper time.

We next consider the field equations and look at the $R_{01}$ component.
It reduces to the equation

$$2\dot{\mu}' + \dot{\mu}\mu' - \dot{\omega}\mu' = 0. \tag{13.8}$$

This integrates to

$$e^{\omega} = \frac{e^{\mu}\mu'^2}{4(1 + g)}, \tag{13.9}$$

where g is an arbitrary function of r only. Next, the (1, 1) component of
the field equations becomes

$$\frac{1}{4}\mu'^2 e^{-\omega} - \ddot{\mu} - \frac{3}{4}\dot{\mu}^2 - e^{-\mu} = 0. \tag{13.10}$$

Using (13.9) this can be integrated to

$$\dot{\mu}^2 = 4g(r)e^{-\mu} + 4F(r)e^{-3\mu/2}. \tag{13.11}$$

Here F(r) is a function of r only. Finally, from Equations (13.9) and the
(0, 0) component of the field equations, we find that

$$F'(r) = 4\pi G\rho\,e^{3\mu/2}\mu'. \tag{13.12}$$

Now suppose that the dust ball was of uniform density $\rho_0$ at t = 0
and that it was at rest then. Thus we assume that at that initial moment
the time derivatives of $\omega$ and $\mu$ were zero. We can still choose the r
coordinate and do so at that initial moment by requiring that a sphere of
constant r has the surface area $4\pi r^2$. This requirement leads to

$$e^{\mu(0, r)} = r^2. \tag{13.13}$$


We will specify the extent of the collapsing mass by requiring that it
is limited by $r \le r_b$. For $r > r_b$ we may assume that the space is empty
and describable by the Schwarzschild solution.

By applying (13.12) to the situation at t = 0, we get

$$F(r) = \frac{8\pi G\rho_0}{3}\,r^3 = \alpha r^3, \tag{13.14}$$

say. There is an arbitrary constant of integration that corresponds to a
point mass at r = 0, which we set equal to zero. Also, from Equation
(13.11), at $t = 0$, $\dot{\mu} = 0$ we get

$$g(r) = -\frac{1}{r}F(r) = -\alpha r^2. \tag{13.15}$$

For t > 0 we get a solution for $\mu$ by writing

$$e^{\mu/2} = rS(t) \tag{13.16}$$

and using (13.11), (13.14) and (13.15), we get

$$\dot{S}^2 = \alpha\left(\frac{1 - S}{S}\right). \tag{13.17}$$

The initial conditions are $S(0) = 1$, $\dot{S}(0) = 0$. A comparison with
the Newtonian problem that we briefly looked at earlier shows that,
had we written there $R = S(t)R_0$, we would have got exactly the same
equation for S(t) as (13.17). The solution also matches the Newtonian
one when we get the time of collapse to S = 0 as

$$t_0 = \frac{\pi}{2\sqrt{\alpha}}. \tag{13.18}$$

Fig. 13.1. A plot of the scale factor S(t) of a supermassive dust ball
undergoing gravitational collapse. The time $t_s$ when the outer surface
of the ball crosses the Schwarzschild radius is shown by a dotted line.


Equation (13.9) determines the function $e^{\omega}$ in terms of the above
quantities as

$$e^{\omega} = \frac{S^2(t)}{1 - \alpha r^2}. \tag{13.19}$$


The line element inside the dust ball thus takes the form

$$ds^2 = dt^2 - S^2(t)\left[\frac{dr^2}{1 - \alpha r^2} + r^2(d\theta^2 + \sin^2\theta\,d\phi^2)\right]. \tag{13.20}$$


We have encountered this line element in Chapter 6 as an example 
of a maximally symmetric space of three spatial dimensions. We will 
encounter it again in Chapter 14 as a cosmological spacetime metric. 

Figure 13.1 shows the function S(t) plotted between $t = 0$ and $t = t_0$.
We can match it to an exterior Schwarzschild solution for a mass

$$M = 4\pi r_b^3\rho_0/3. \tag{13.21}$$

For, at t = 0, the proper radius of the ball is $r_b$ and its density is
$\rho_0$. As the ball contracts by a factor S(t), its density goes up by a factor
$S(t)^{-3}$, compensating for the reduction of its proper volume. Thus its
mass remains the same during collapse.

For an external observer at a constant radial Schwarzschild
coordinate, the collapsing object would have an effective radius $R_b = r_b S(t)$.
It will become equal to the Schwarzschild radius when $S = \alpha r_b^2$. This
radius is crossed in a finite time as measured on the t-scale. This is when
the object has become a black hole.

At this stage, we would like to emphasize that the technique used
here to solve the collapse problem is the one used by B. Datt [34]
from Kolkata in 1938. The same problem was solved a year later by
Oppenheimer and Snyder [35]. Because it was published in a
higher-profile journal the latter work received greater publicity than did the
earlier one by Datt. Thus the collapse problem is commonly called
the ‘Oppenheimer-Snyder problem’. We will refer to it as the
‘Datt-Oppenheimer-Snyder problem’ or simply the ‘DOS problem’.


13.2.3 Collapse viewed from outside

Let us look at the DOS problem from the vantage point of the external
Schwarzschild observer. The line element outside the object is, of course,

$$ds^2 = \left(1 - \frac{2GM}{R}\right)dT^2 - \frac{dR^2}{1 - \dfrac{2GM}{R}} - R^2(d\theta^2 + \sin^2\theta\,d\phi^2). \tag{13.22}$$


Note that we have departed from our earlier notation by changing
the coordinates (t, r) to (T, R) since we want to relate this line element
to the one considered for the collapsing dust ball. Thus the line element
(13.22) has to be matched to the line element (13.20) at the boundary
$r = r_b$ of the collapsing object. So we need at the boundary the condition

$$R = r_b S(t). \tag{13.23}$$


Next consider a test particle at the boundary. It is falling freely and so
follows a timelike geodesic. From our general solution of the geodesics
in Schwarzschild’s spacetime we use Equation (9.27) to deduce that for
radial free fall

$$\frac{dT}{ds}\left(1 - \frac{2GM}{R}\right) = \text{constant}. \tag{13.24}$$

However, from the same geodesic equation for spacetime given by
the line element (13.20) we have

$$\frac{dt}{ds} = 1. \tag{13.25}$$

Hence on the surface we must have

$$\frac{dT}{dt}\left(1 - \frac{2GM}{R}\right) = \text{constant} = \gamma \tag{13.26}$$

(say). Since the line element is a first integral of the geodesic equations,
we use Equation (13.22) together with the above to get

$$\left(\frac{dR}{dt}\right)^2 = \gamma^2 - 1 + \frac{2GM}{R}. \tag{13.27}$$


On the boundary, $R = r_b S(t)$, this equation becomes

$$\left(\frac{dS}{dt}\right)^2 = \frac{\gamma^2 - 1}{r_b^2} + \frac{2GM}{r_b^3 S(t)}. \tag{13.28}$$


A comparison with Equations (13.17) and (13.21) gives

$$2GM = \alpha r_b^3, \qquad \gamma^2 = 1 - \alpha r_b^2. \tag{13.29}$$

We had arrived at this value of M from earlier discussions of gravitational
collapse.

Consider now a radially outward light signal from an observer B on
the boundary of the body at $R = R_1$ leaving him at time $T_1$ and reaching
another Schwarzschild observer A located at fixed $R = R_2$ at time $T_2$.
Notice that $R_1$ decreases as the object collapses.

The radial null geodesic equation then yields

$$T_2 - T_1 = \int_{R_1}^{R_2}\left(1 - \frac{2GM}{R}\right)^{-1} dR. \tag{13.30}$$

This integral diverges as $R_1 \to 2GM$ ($= R_s$). Let, at $t = t_s$, $R_s =
r_b S(t_s)$. Then, as $t \to t_s$, $T_2 \to \infty$. In short, the observer A at $R_2$ has to
wait for ever for the signal sent out by B at $t_s$.
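
The divergence of the integral (13.30) is logarithmic, which can be seen from its closed form, $(R_2 - R_1) + 2GM \ln[(R_2 - 2GM)/(R_1 - 2GM)]$. The sketch below evaluates it in units with $c = 1$ and, as an illustrative assumption, $GM = 1$; the function name is hypothetical.

```python
import math

def signal_coordinate_time(r1, r2, gm=1.0):
    """Schwarzschild coordinate time T2 - T1 for an outgoing radial light ray.

    Closed form of the integral of (1 - 2GM/R)**-1 from r1 to r2, valid for
    2GM < r1 <= r2. Units with c = 1; gm stands for the product G*M.
    """
    rs = 2.0 * gm
    if not (rs < r1 <= r2):
        raise ValueError("need 2GM < r1 <= r2")
    return (r2 - r1) + rs * math.log((r2 - rs) / (r1 - rs))

# Emission points closer and closer to the Schwarzschild radius R_s = 2.
offsets = (1.0, 1e-3, 1e-6, 1e-9)
delays = [signal_coordinate_time(2.0 + eps, 10.0) for eps in offsets]
```

Each factor of $10^{-3}$ in the distance of the emitter from $R_s$ adds only about $2GM \ln 10^{3} \approx 13.8\,GM$ to the delay, yet the total grows without bound.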

Figure 13.2 shows the signal propagation from B to A on the
contracting object. If A has a means of measuring wavelength, he will
find that light waves from B are increasingly redshifted. The shift is
gravitational as well as Doppler.

We may liken the signal exchanges between A and B to a
correspondence between an Applicant and a Bureaucrat. The applicant may think
that the bureaucrat is being very dilatory... but this is really the trick
played by curved spacetime! The time flows at different rates for the two
protagonists.

It is clear that as B approaches $R_s$ his signals begin to be more and
more difficult to receive, both because of their redshift and owing to
their faintness.

Fig. 13.2. A schematic diagram showing signal exchanges between observers
A and B. For details, refer to the text.

Example 13.2.2 Problem. Show that the time measured by the comoving
observer riding on the surface of the gravitationally collapsing dust ball
registers a finite value as he crosses the Schwarzschild barrier.

Solution. Using the calculations in the text, we find that the time measured
by B is given by t and Equation (13.17) describes the rate of collapse. The
Schwarzschild barrier is crossed when $S = \alpha r_b^2$.

To solve (13.17) write $S = \cos^2\theta$ (here $\theta$ is an auxiliary parameter, not
the polar angle). Then Equation (13.17) becomes

$$2\cos^2\theta\,\dot{\theta} = \sqrt{\alpha}.$$

This integrates to

$$\theta + \sin\theta\cos\theta = \sqrt{\alpha}\,t$$

so that, at t = 0, $\theta = 0$ and S = 1. The state S = 0 is reached when $\theta = \pi/2$.
At the Schwarzschild barrier the observer B will have $\cos^2\theta = \alpha r_b^2$ so that
the corresponding time is

$$t_s = \frac{1}{\sqrt{\alpha}}\left(\cos^{-1}\left(\sqrt{\alpha}\,r_b\right) + \sqrt{\alpha r_b^2\left(1 - \alpha r_b^2\right)}\right).$$

This is finite and less than $\pi/(2\sqrt{\alpha})$.

Problem. Estimate $\alpha$ and $r_b$ for a homogeneous dust ball with the Sun’s mass
and radius as the starting values for the DOS problem.

Solution. For the Sun $M_\odot = 2 \times 10^{33}$ g and $R_\odot = 7 \times 10^{10}$ cm are the mass
and radius. So $r_b = R_\odot = 7 \times 10^{10}$ cm.

The starting density is $\rho_0 = 3M_\odot/(4\pi R_\odot^3) = 1.4$ g cm$^{-3}$. The formula
for $\alpha$ then gives

$$\alpha = \frac{8\pi\rho_0 G}{3} = 7.7 \times 10^{-7}\ \mathrm{s}^{-2}.$$

The barrier at $R_s$ is thus a one-way membrane. Having crossed
it, B will continue to receive signals from A, but his own messages
will not be able to cross the barrier, let alone reach A. This
Schwarzschild barrier marks the boundary of what is called a black
hole. It is also usual to refer to the boundary $R = R_s$ as the ‘event
horizon’. We will refer to this aspect more specifically later in this
chapter.

The external observer does not see what happened to B after B has
crossed this barrier. B’s fate is not very pleasant. Besides any tidal effects
that may tear him apart, at $t = t_0$ he hits the state described by S = 0.
This is a state wherein the entire space ‘shrinks’ to zero volume with
the density going to infinity. Parameters describing spacetime geometry
diverge and there is no way of describing what is happening
mathematically or physically. This extreme state of space, time and matter is
called a singularity. We will refer back to this strange aspect of spacetime
geometry in Chapter 18.


13.3 The Schwarzschild solution in other 
coordinate systems 

The line element (13.22) is somewhat inconvenient for discussing
regions containing the Schwarzschild barrier. For example, we find that
the metric components $g_{00}$ and $g_{11}$ become respectively zero and infinity
at R = 2GM. Also, inside the barrier the coordinates T, R interchange
their time/spacelike character. It is sometimes preferable to use other
coordinates, which behave normally in this region. We have already
seen that the comoving coordinates used for discussing collapse do not
throw up any problem at $R = R_s$. However, they are not so convenient
for relating to an external observer. We describe some other coordinate
systems that have been found useful for connecting across the barrier
at $R = R_s$.


13.3.1 Eddington coordinates

In these coordinates, the Schwarzschild coordinate T is replaced by the
‘null’ coordinate

$$V = T + R + 2GM\ln\left|\frac{R}{2GM} - 1\right|. \tag{13.31}$$

It can be seen via a simple manipulation that $V = V_0$ (constant) describes
a null geodesic corresponding to a radial null ray going in from outside.
Also, the line element (13.22) is transformed to

$$ds^2 = \left(1 - \frac{2GM}{R}\right)dV^2 - 2\,dV\,dR - R^2(d\theta^2 + \sin^2\theta\,d\phi^2). \tag{13.32}$$

This coordinate was used by Eddington [36] to connect observers
A and B, A outside the barrier and B inside, with light rays coming
from A to B.



13.3.2 Kruskal-Szekeres coordinates

These coordinates were independently discovered in 1960 by M. D.
Kruskal and G. Szekeres [37, 38]. We give the transformations from
the Schwarzschild coordinates below. It will be clear that they carry the
Eddington coordinates a step further in using null tracks.

The coordinate system involves a changeover from T, R coordinates
to u, v coordinates while leaving the other two coordinates $\theta$, $\phi$
unchanged. The transformations relate to four different but connected
regions of spacetime, which we will denote by I, II, III and IV. Briefly,
we have the following.


Region I: R > 2 GM, u > 0: 


R 

2 GM 
R 

2GM 

Region II: R < 2GM, v > 0: 

1/2 


1/2 

— 1 ) exp 

1/2 

— 1 ) exp 


u — I 1 — 


R 


2 GM 


exp 


v = 1 — 


R 

2GM 


1/2 


exp 


R 

4GM 

R 

AGM 


R 

AGM 

R 

AGM 


cosh 


sinh 


T \ 


AGM 
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T 

AGM 


sinh 


cosh 


AGM / ’ 


T \ 

AGM 
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Region III: R > 2GM, u < 0: 


R 


2GM 

R 


- 1 


1/2 


exp 


- 1 


1/2 


2GM 

Region IV: R < 2GM, v < 0: 

1/2 


exp 


- I cosh f ——— ^ . 

ACM J \AGM ) 

T \ 

- I siniil - . 

AGM J \AGM J 


u — — [ 1 — 


R 


2GM 


exp 


T \ 

AGM) ” lnl \4GM ) 


v = — 1 — 


R 


2GM 


1/2 


exp 


R 


AGM 


cosh 


AGM 


For all four regions, the line element is, however, the same:

$$ds^2 = \frac{32G^3M^3}{R}\exp\left(-\frac{R}{2GM}\right)(dv^2 - du^2) - R^2(d\theta^2 + \sin^2\theta\,d\phi^2). \tag{13.33}$$

The coordinates are thus $u, v, \theta, \phi$ and we may look upon R in the above
line element as a function of (u, v) given implicitly by

$$\left(\frac{R}{2GM} - 1\right)\exp\left(\frac{R}{2GM}\right) = u^2 - v^2. \tag{13.34}$$

As can be seen from Equation (13.33), the line element is well
behaved at $R = R_s$. Figure 13.3 is the so-called Kruskal-Szekeres
diagram showing a radial constant $\theta$, $\phi$ section with some important lines
marked. The lines R = constant are rectangular hyperbolae in the u-v
plane. The lines R = 2GM form a cross with one arm (SW to NE)
having $T = \infty$ while the other arm (SE to NW) has $T = -\infty$. The four
regions I, II, III and IV are within the sectors defined by the radial lines
$u^2 - v^2 = 0$.
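
The transformation to Kruskal-Szekeres coordinates can be checked against the implicit relation (13.34). The sketch below assumes the standard form of the Region I and Region II transformations (cosh and sinh of $T/4GM$ multiplying $|R/2GM - 1|^{1/2}\exp(R/4GM)$), uses units with $c = 1$, and writes `gm` for the product GM; the function names are hypothetical.

```python
import math

def kruskal_region_1(T, R, gm=1.0):
    """(u, v) for a point outside the barrier: R > 2GM, u > 0."""
    if R <= 2.0 * gm:
        raise ValueError("Region I requires R > 2GM")
    f = math.sqrt(R / (2.0 * gm) - 1.0) * math.exp(R / (4.0 * gm))
    return f * math.cosh(T / (4.0 * gm)), f * math.sinh(T / (4.0 * gm))

def kruskal_region_2(T, R, gm=1.0):
    """(u, v) for a point inside the barrier: 0 < R < 2GM, v > 0."""
    if not 0.0 < R < 2.0 * gm:
        raise ValueError("Region II requires 0 < R < 2GM")
    f = math.sqrt(1.0 - R / (2.0 * gm)) * math.exp(R / (4.0 * gm))
    return f * math.sinh(T / (4.0 * gm)), f * math.cosh(T / (4.0 * gm))

def implicit_relation(R, gm=1.0):
    """Left-hand side of Eq. (13.34); should equal u**2 - v**2."""
    return (R / (2.0 * gm) - 1.0) * math.exp(R / (2.0 * gm))

u1, v1 = kruskal_region_1(T=1.0, R=3.0)
u2, v2 = kruskal_region_2(T=1.0, R=1.0)
```

Since $u^2 - v^2$ depends on R alone, every curve of constant R is a hyperbola in the u-v plane, with $u^2 > v^2$ outside the barrier and $u^2 < v^2$ inside it.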

The collapsing object has the outer boundary shown in Figure 13.3
by a dotted line. The collapse starts in Region I and ends in Region II.
The boundary point hits the singularity shown by the hyperbola R = 0.
The same trajectory can be continued as in the figure from Region
I to Region IV. What does it represent? It represents the time-reversed
version of collapse: an eruption out of the singularity in Region IV, which
is sometimes called a white hole. We will describe a white hole later.

The Kruskal-Szekeres diagram demonstrates the incompleteness of
the Schwarzschild coordinate system, besides explaining the difference
between a black hole and a white hole.

Fig. 13.3. The Kruskal-Szekeres diagram relates the null coordinates
(u, v) to the Schwarzschild coordinates (R, T) and shows the
incompleteness of the latter.


13.4 Non-spherical gravitational collapse 

The only problem of gravitational collapse we could discuss in some 
detail was one of spherical symmetry. One could argue that stars are by 
and large spherical and will start their collapse in that state. Although 
they would have pressures, these may be neglected in the ‘run-away’ state
of collapse, as mentioned earlier. The formalism developed by Datt can 
deal with any inhomogeneous (but spherically symmetric) initial state. 

Nature, however, might not be so obliging always in giving us
spherical symmetry as an initial condition. The departure from spherical
symmetry complicates the problem enormously. Attempts are being made
to solve the partial differential equations of collapse numerically on a
large computer.







Nevertheless, if one sticks to ‘small’ departures from spherical
symmetry, one can make progress, as R. H. Price’s work has shown. The spirit
of this work (not the details) is conveyed in the following paragraphs of
this section.

Suppose the collapsing body generates an external physical field
$\Phi$, which can be characterized by an integral spin s and zero rest mass
(so as to be of long range). Thus an electromagnetic disturbance is
characterized by s = 1, whereas a gravitational one has s = 2. These
disturbances may be treated as small, first-order, perturbations on an
external Schwarzschild solution, which is assumed to be left unchanged
at ‘zeroth order’.

Suppose the external field is written as a power-series expansion 
over spherical harmonics: 

* = ^ 'S,(9, cj>). (13.35) 

n,s 

The coefficients are functions of R and T. When the time 
development of these coefficients is carried out, most of them die away 
as the object collapses. What does remain at the end? 

Price found that all harmonics of order $n \ge s$ are radiated away and
only those with $n < s$ remain. Thus nothing survives from a scalar
field (s = 0), only the electric charge survives as a source in the s = 1
electromagnetic case, while mass (n = 0) and angular momentum
(n = 1) are left as sources in the s = 2 gravitational case.

This analysis is limited to small departures from spherical
symmetry. However, if we stretch our belief to larger departures away from
spherical symmetry, it tells us that if we are limited to gravitational and
electromagnetic interactions only (these are the only long-range basic
interactions known today) then the end point of gravitational collapse for
an external observer is a black hole with mass (M), electric charge (Q)
and angular momentum (H). It is therefore of interest to know whether
such black holes exist and how they are described.

13.5 The Reissner-Nordstrom solution 

H. Reissner in 1916 and G. Nordstrom in 1918 independently arrived at a
solution for the metric exterior to a spherically symmetric distribution of
charged matter with total mass M and electric charge Q. (See References
[39, 40].) We will briefly show how the problem is solved and discuss
the nature of the solution.

We start with a spherically symmetric line element as in the
Schwarzschild case:

$$ds^2 = e^{\nu}\,dT^2 - e^{\lambda}\,dR^2 - R^2(d\theta^2 + \sin^2\theta\,d\phi^2). \tag{13.36}$$

The only difference is that we have a Coulomb field $F^{ik}$, generated
by the charge Q at the origin, that has an energy-momentum tensor
given by

$$T^{ik} = -\frac{1}{4\pi}\left[F^{i}{}_{m}F^{km} - \frac{1}{4}F_{lm}F^{lm}g^{ik}\right]. \tag{13.37}$$


Recall from Section 7.5.2 that this is the general expression for the 
energy-momentum tensor of the electromagnetic field. 

From the condition that the source is static, we assume that the
solution of the electromagnetic field as well as the spacetime geometry
will be static too. Thus we try the solution for the 4-potential as $A_i =
[\psi(R), 0, 0, 0]$. Then, for the only non-zero field components $F_{01} =
-F_{10}$, we have

$$F_{01} = -\psi', \qquad F^{01}\sqrt{-g} = e^{-(\lambda+\nu)/2}R^2\sin\theta \times \psi'.$$


The condition for a point charge at the origin is that the covariant
divergence of $F^{ik}$ should vanish everywhere except at R = 0. From the
above relation

$$\frac{\partial}{\partial R}\left(F^{01}\sqrt{-g}\right) = 0, \qquad \psi' = \frac{E}{R^2}\,e^{(\lambda+\nu)/2},$$

where E is a constant of integration. On substituting into the expression
(13.37) for $T^{ik}$ we get the non-zero components as

$$T^0_0 = T^1_1 = -T^2_2 = -T^3_3 = \frac{1}{8\pi}\,e^{-(\lambda+\nu)}\psi'^2. \tag{13.38}$$

We now substitute $-8\pi G T^i_k$ on the right-hand side of the
Einstein equations written out for the above metric. Referring back to the
Schwarzschild solution of Chapter 9, we again see that the equations with
i = k = 0 and i = k = 1 taken together give as before $\lambda' + \nu' = 0$. As
on that occasion, we can again simplify the solution by having $\lambda = -\nu$.
So we are left with only one independent equation:

$$e^{\nu}(1 + R\nu') = 1 - \frac{GE^2}{R^2}. \tag{13.39}$$


This simple differential equation can be solved and we get the solution
as

$$e^{\nu} = 1 - \frac{B}{R} + \frac{GE^2}{R^2}.$$

Fig. 13.4. A schematic view of the Reissner-Nordstrom solution. It has
two spheres on which horizon-like properties are found.

Here B is a constant of integration. If we look at the asymptotic
form of the line element at large R and demand that it look like that
for mass M at the origin, then we get B = 2GM. Likewise, if we
ascribe a magnitude Q to the charge at the origin, then its asymptotic
Coulomb field in Gaussian units would be $Q/R^2$. A comparison with our
solution enables us to set E = Q. Thus we have the following Reissner-
Nordstrom line element to describe a spacetime around a spherical mass
M and charge Q:


$$ds^2 = \left(1 - \frac{2GM}{R} + \frac{GQ^2}{R^2}\right)dT^2 - \left(1 - \frac{2GM}{R} + \frac{GQ^2}{R^2}\right)^{-1}dR^2 - R^2(d\theta^2 + \sin^2\theta\,d\phi^2). \tag{13.40}$$


It is easy to see that there is an apparent problem where $e^{\nu}$ vanishes,
i.e., at two values of R at

$$R_\pm = GM \pm \sqrt{G^2M^2 - GQ^2}. \tag{13.41}$$

We have not one but two surfaces where $e^{\nu}$ vanishes. The outer one
($R = R_+$) plays effectively the role of an event horizon of the black hole
just as the $R = R_s$ surface does for a charge-free body. See Figure 13.4
for a schematic description of the Reissner-Nordstrom black hole.
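
The two roots of Equation (13.41) are easy to evaluate numerically. The following Python sketch is an illustration only: it assumes units with c = 1 and a Gaussian charge, and the function names `rn_horizons` and `e_nu` are hypothetical helpers, not notation from the text. It returns both radii and confirms that $e^{\nu}$ from Equation (13.40) vanishes there.

```python
import math

def rn_horizons(M, Q, G=1.0):
    """Roots R_-, R_+ of e^nu = 0, following Eq. (13.41).

    Assumes c = 1 and Gaussian charge units. Returns None when
    G^2 M^2 < G Q^2, in which case e^nu has no real zero.
    """
    disc = (G * M) ** 2 - G * Q ** 2
    if disc < 0:
        return None
    root = math.sqrt(disc)
    return G * M - root, G * M + root

def e_nu(R, M, Q, G=1.0):
    """Metric coefficient e^nu = 1 - 2GM/R + GQ^2/R^2 of Eq. (13.40)."""
    return 1.0 - 2.0 * G * M / R + G * Q ** 2 / R ** 2

if __name__ == "__main__":
    r_minus, r_plus = rn_horizons(M=1.0, Q=0.6)
    print("R- =", r_minus, "R+ =", r_plus)
    print("e^nu at R+ :", e_nu(r_plus, 1.0, 0.6))
```

Setting Q = 0 in this sketch gives the pair (0, 2GM), so the outer root reduces to the Schwarzschild radius, as the text states for a charge-free body.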


13.6 The Kerr solution 

In 1963 Roy Kerr obtained what may arguably be the most important 
exact solution of Einstein’s field equations since the Schwarzschild
solution. It describes the spacetime outside a spinning mass. Specifically,
the line element for the empty spacetime outside a mass M having an angular
momentum H is given by

$$ds^2 = \frac{\Delta}{\rho^2}(dT - h\sin^2\theta\,d\phi)^2 - \frac{\rho^2}{\Delta}\,dR^2 - \rho^2\,d\theta^2 - \frac{\sin^2\theta}{\rho^2}\left[(R^2 + h^2)\,d\phi - h\,dT\right]^2, \tag{13.42}$$

where we define

h = H/M = angular momentum per unit mass as measured about the
polar axis ($\theta = 0$, $\theta = \pi$),

$$\Delta = R^2 - 2GMR + h^2, \qquad \rho^2 = R^2 + h^2\cos^2\theta.$$

For details on how this line element was derived, see Kerr’s origi¬ 
nal paper [41]. We will look at some important properties of the Kerr 
solution. 


13.6.1 The static limit

As we go closer and closer to the Kerr black hole, we notice the effect
of rotation in various ways. An analogy with the Earth helps: because of
the Earth's spin, we see stars rise in the East and set in the West. If we
wish to see the stars stationary, as they really are in the frame of
reference in which the distant stellar background is at rest, we need to
counteract the Earth’s spin by travelling in a fast aircraft or in a space
station from East to West.

The same would happen for the Kerr solution, but up to a limit. The
coordinates $(R, \theta, \phi)$ are constant for distant stars and, if an
observer wishes to stay at rest in such a frame, he will encounter greater
and greater difficulty as he approaches the object. He will be required to
exert stronger and stronger force to stay in the same place relative to
distant stars. Thus the line element shows that a world line having a
constant $(R, \theta, \phi)$ will be timelike provided that

$$R > R(\theta) = GM + \sqrt{G^2M^2 - h^2\cos^2\theta}. \tag{13.43}$$

For $R < R(\theta)$ the observer will be dragged along past the constant
$(R, \theta, \phi)$ framework in the direction in which the mass is spinning.
Even if the observer employs rocket power to counter this drag, it will be
to no avail. This surface $R = R(\theta)$ is called the ‘static limit’.





13.6.2 The ergosphere

Just as we have horizons for the Schwarzschild and the Reissner- 
Nordstrom black holes, here too we have an event horizon, provided, 
of course, that the object itself is not larger than it. The Kerr horizon is 
located at A = 0, i.e., at 

$$R = R_+ = GM + \sqrt{G^2M^2 - h^2}. \tag{13.44}$$

It can be easily verified that, as in the Schwarzschild case, light rays 
may enter the horizon from outside (R > R + ), but they cannot emerge 
outwards from inside the horizon. 

The surface signifying the static limit is larger than the above horizon.
As shown in Figures 13.5(a) and (b), the two surfaces touch each
other at the poles ($\theta = 0$, $\theta = \pi$); otherwise the static limit
is strictly outside the horizon. The volume between the two surfaces, shown
filled by dots, has been named ‘the ergosphere’. The reason for the name is
that the compulsive spin imposed on any piece of matter entering the
ergosphere enables us to ‘extract’ energy from the spinning black hole.
The black hole thereby loses some of its rotational energy. The rapidly
rotating piece will carry that energy away, if it is enabled to emerge from
the ergosphere.
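
The geometry of the ergosphere can be made concrete with a short numerical sketch. The Python below is illustrative only: it assumes c = 1, and the helper names are hypothetical. It evaluates the horizon of Equation (13.44), the static limit of Equation (13.43), and the coefficient of $dT^2$ read off from Equation (13.42), which changes sign at the static limit.

```python
import math

def kerr_horizon(M, h, G=1.0):
    """Outer horizon R_+ of Eq. (13.44); requires h <= G*M."""
    return G * M + math.sqrt((G * M) ** 2 - h ** 2)

def kerr_static_limit(M, h, theta, G=1.0):
    """Static-limit radius of Eq. (13.43) at polar angle theta."""
    return G * M + math.sqrt((G * M) ** 2 - (h * math.cos(theta)) ** 2)

def g_TT(R, theta, M, h, G=1.0):
    """Coefficient of dT^2 in Eq. (13.42): (Delta - h^2 sin^2 theta) / rho^2."""
    delta = R * R - 2.0 * G * M * R + h * h
    rho2 = R * R + (h * math.cos(theta)) ** 2
    return (delta - (h * math.sin(theta)) ** 2) / rho2

if __name__ == "__main__":
    M, h = 1.0, 0.8
    print("horizon          :", kerr_horizon(M, h))
    print("static limit pole:", kerr_static_limit(M, h, 0.0))
    print("static limit eq. :", kerr_static_limit(M, h, math.pi / 2))
    # Inside the ergosphere a world line of constant (R, theta, phi) is not timelike.
    print("g_TT at R = 1.8  :", g_TT(1.8, math.pi / 2, M, h))
```

For these sample values the two surfaces coincide at the pole and are furthest apart on the equator, where the static limit sits at R = 2GM whatever the value of h.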


13.6.3 The Kerr-Newman black hole

Ted Newman and his colleagues at Pittsburgh University combined the 
Reissner-Nordstrom solution with the Kerr solution and generated the 
spacetime geometry for a charged spinning mass. Thus the line element 
for a mass M with angular momentum H and electric charge Q is given 
by Equation (13.42), but with the quantity A redefined as 

$$\Delta = R^2 - 2GMR + h^2 + GQ^2. \tag{13.45}$$

Like the Kerr solution, this solution also exhibits the properties of 
a horizon, a static limit and the ergosphere. Further, if we set h = 0, we 
come back to the Reissner-Nordstrom black hole. This solution has one 
special significance. 

As we saw in Section 13.4, Price’s theorem indicates (but does not 
prove in the most general collapse case) that a black hole forming by 
gravitational collapse under the classical long-range forces of electro¬ 
magnetism and gravitation will at most exhibit mass, electric charge 
and angular momentum. This state is precisely described by the Kerr-
Newman black hole with its three parameters M, H and Q.

Fig. 13.5. In (a) we see a 'constant-latitude' section of the Kerr black
hole. The portion between the horizon and the static limit belongs to the
ergosphere. In (b) the section along a longitude great circle shows the
location of the 'poles' at $\theta = 0, \pi$ and the section of the
ergosphere (marked with dots).

We will now look at the general physical properties of black holes,
drawing analogies with the laws of thermodynamics. While illustrating
them quantitatively, we will use the Kerr-Newman black hole or the
simpler Kerr black hole as the typical example of a black hole.


13.7 Black-hole physics 

It has been noticed that the rules describing the dynamical properties of
black holes bear a striking resemblance to the laws of thermodynamics.

Following the analogy with the laws of thermodynamics, we begin 
by stating the first law of black-hole physics: 

In any process involving a black hole and other objects, the total energy,
momentum, angular momentum and electric charge are conserved.

This simply means, for example, that, if a Schwarzschild black hole 
gobbles up a mass having total energy E, its mass will grow from the 
earlier value M to M + E. 

This raises the question, can the process be reversed? That is, can 
we extract energy from the black hole and reduce its mass? The answer 
is given by the second law of black-hole physics: 

In any physical interaction, the surface area of a black hole can never
decrease.

The wording has a distinct similarity to the second law of thermo¬ 
dynamics, with the ‘surface area’ playing the role of entropy. Let us 
examine this law and also seek an answer to the question raised above. 

The surface area of a black hole is the area of its horizon surface. For 
the Schwarzschild black hole, the horizon is given by R = R s = 2GM, 
so the surface area of the black hole is simply 

$$A = 4\pi R_s^2 = 16\pi G^2M^2. \tag{13.46}$$

With this definition, A cannot decrease and so M cannot decrease. This
in turn implies that we cannot extract energy from this black hole since
such a process would tend to reduce rather than increase M. However,
not all is lost! For, if we have a spinning black hole, we note from
Equation (13.44) that the horizon is at $R = R_+$. From the line element
we see that at T = constant and R = constant any surface described by
$[\theta, \theta + d\theta] \times [\phi, \phi + d\phi]$ has area

$$\rho\,d\theta \times (R_+^2 + h^2)\,\frac{\sin\theta}{\rho}\,d\phi = (R_+^2 + h^2)\sin\theta\,d\theta\,d\phi.$$

By integrating over $\theta$, $\phi$, we get the surface area of the Kerr-
Newman black hole as

$$A = 4\pi(R_+^2 + h^2) = 4\pi\left[2G^2M^2 - GQ^2 + 2GM\sqrt{G^2M^2 - GQ^2 - h^2}\right]. \tag{13.47}$$


We now take the differential of the above expression, using the
fact that the variables to change are A, M, Q and H. Since H has
directionality like a vector, we should write it as a three-dimensional
spatial vector $\mathbf{H}$. The same applies to h. Thus, after some
manipulation, we get

$$\frac{\delta A}{8\pi G} = \frac{1}{\sqrt{G^2M^2 - GQ^2 - h^2}} \times \left[(R_+^2 + h^2)\,\delta M - R_+Q\,\delta Q - \mathbf{h}\cdot\delta\mathbf{H}\right]. \tag{13.48}$$


As in thermodynamics, we assume that the most efficient way of
running a process is to ensure that it is running reversibly, insofar as the
area is concerned. In short, we set $\delta A$ as zero and simplify the above
expression to

$$\delta M = \frac{R_+Q\,\delta Q}{R_+^2 + h^2} + \frac{\mathbf{h}\cdot\delta\mathbf{H}}{R_+^2 + h^2}. \tag{13.49}$$


Is it possible to reduce the mass of a black hole? For that is the only 
way we can extract energy from it. The above equation tells us that if we 
reduce the electric charge or the angular momentum of the black hole 
we can achieve this feat. The Penrose mechanism described below is a 
way to do this. 
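
As a numerical cross-check on Equations (13.47) and (13.48), the following Python sketch compares the actual change in horizon area with the first-order prediction. It is a sketch under stated assumptions: units with c = 1, H taken along the spin axis so that the scalar product reduces to h times the change in H, and function names that are illustrative rather than taken from the text.

```python
import math

def horizon_area(M, Q, H, G=1.0):
    """Kerr-Newman horizon area, Eq. (13.47), with h = H/M and c = 1."""
    h = H / M
    s = math.sqrt((G * M) ** 2 - G * Q ** 2 - h ** 2)
    return 4.0 * math.pi * (2.0 * (G * M) ** 2 - G * Q ** 2 + 2.0 * G * M * s)

def delta_area(M, Q, H, dM, dQ, dH, G=1.0):
    """First-order change in area predicted by Eq. (13.48).

    H and dH are components along the spin axis (an assumption of this
    sketch), so the product h . dH is simply h * dH.
    """
    h = H / M
    s = math.sqrt((G * M) ** 2 - G * Q ** 2 - h ** 2)
    r_plus = G * M + s
    bracket = (r_plus ** 2 + h ** 2) * dM - r_plus * Q * dQ - h * dH
    return 8.0 * math.pi * G * bracket / s

if __name__ == "__main__":
    M, Q, H = 1.0, 0.3, 0.5
    dM, dQ, dH = 1e-6, 2e-6, -1e-6
    exact = horizon_area(M + dM, Q + dQ, H + dH) - horizon_area(M, Q, H)
    print("finite difference:", exact)
    print("Eq. (13.48)      :", delta_area(M, Q, H, dM, dQ, dH))
```

The two printed numbers should agree to within the second-order terms neglected in the differential. Choosing changes in Q and H that make `delta_area` vanish reproduces the reversible case of Equation (13.49).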


13.7.1 The Penrose process

The process proposed by Penrose is in fact a thought experiment 
designed to extract energy from the rotating Kerr black hole by using the 
properties of the ergosphere discussed in the text. As shown in Figure 
13.6, the process involves dropping a mass into the ergosphere, arrang¬ 
ing for it to split into two bits, with one bit falling inside the horizon and 
the other escaping outside the ergosphere. 

What happens here is that the mass entering the ergosphere is made 
to rotate along with the black hole, as discussed in the text. It therefore 
acquires energy as well as angular momentum from the black hole. 
When it splits and part of it falls into the black hole, it loses a fraction 
of the acquired energy and angular momentum back to the black hole.
The escaping portion can, however, emerge with so much energy that it
exceeds the energy of the total original mass.

Fig. 13.6. A schematic illustration of the Penrose process. See the text
for details.


13.7.2 Surface gravity

The analogy with thermodynamics can be pushed further by comparing 
the standard relation in thermodynamics, 


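$$dE = T\,dS - P\,dV, \tag{13.50}$$

with Equation (13.48) rewritten as

$$\delta M = \kappa\,\frac{\delta A}{8\pi G} + \frac{\mathbf{h}\cdot\delta\mathbf{H}}{R_+^2 + h^2} + \frac{R_+Q\,\delta Q}{R_+^2 + h^2}, \tag{13.51}$$

where the function $\kappa$ is defined by

$$\kappa = \frac{\sqrt{G^2M^2 - GQ^2 - h^2}}{h^2 + R_+^2}. \tag{13.52}$$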


What is this function supposed to represent? It is known as the
surface gravity of the black hole. If we do a naive Newtonian calculation
for a Schwarzschild black hole of mass M, its radius at the horizon is
2GM. The Newtonian acceleration due to gravity at this distance is

$$\kappa = \frac{G \times M}{(2GM)^2} = \frac{1}{4GM}.$$


The expression above reduces to this value for the Schwarzschild
case. Thus $\kappa$ in Equation (13.51) measures surface gravity and in our
analogy with thermodynamics it plays the role of temperature. Just as
temperature is constant in thermal equilibrium, we have the corresponding
‘zeroth law’ of black-hole physics telling us of the existence of surface
gravity, which happens to be constant over the horizon. This law was
proposed by Bardeen, Carter and Hawking [42], whereas the second law was
proposed by Hawking [43].
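
A minimal Python sketch of Equation (13.52) makes the two limiting cases easy to inspect. It assumes c = 1, and the function name `surface_gravity` is a hypothetical helper. The Schwarzschild case should return 1/(4GM), and the value should fall to zero when $G^2M^2 = GQ^2 + h^2$.

```python
import math

def surface_gravity(M, Q=0.0, H=0.0, G=1.0):
    """Surface gravity kappa of Eq. (13.52), with h = H/M and c = 1."""
    h = H / M
    # Guard against tiny negative round-off at the extreme limit.
    s = math.sqrt(max((G * M) ** 2 - G * Q ** 2 - h ** 2, 0.0))
    r_plus = G * M + s
    return s / (h ** 2 + r_plus ** 2)

if __name__ == "__main__":
    print("Schwarzschild, M = 2:", surface_gravity(2.0))         # 1/(4GM)
    print("Kerr, M = 1, H = 0.9:", surface_gravity(1.0, H=0.9))
    print("extreme Kerr        :", surface_gravity(1.0, H=1.0))
```

Scanning H from 0 up to $GM^2$ at fixed M in this sketch shows the surface gravity decreasing monotonically, which is the sense in which faster spin corresponds to a 'colder' black hole in the thermodynamic analogy.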

The analogy with thermodynamics was carried a step further by
Bardeen et al. when they argued that it is impossible to reduce $\kappa$ to
zero by any finite set of operations. This statement matches the third law
of thermodynamics.

Consider the Kerr black hole. What is the maximum angular momentum
it can have for a given mass M? Since $R_+$ must be real, we need
to have $h \le GM$, i.e., $H \le GM^2$. The state in which $H = GM^2$
describes what is called an extreme Kerr black hole. Notice that it has
zero surface gravity and, by virtue of the theorem of Bardeen, Carter and
Hawking, such a state cannot be attained in nature by any finite sequence
of operations. It corresponds to the state of absolute zero temperature of
thermodynamics.

With the help of the laws of black-hole physics we can understand 
the limitations on the Penrose process. Note that, as we reduce the mass 
and the angular momentum of the black hole, we can at best keep its 
area A constant. In general we see that if we travel along a constant-area 
curve we finally end with the Schwarzschild black hole of zero angular 
momentum. If the third law of black-hole physics holds, then we can 
at best start this process when the Kerr black hole is in what is known 
as the extreme state (when its surface gravity is zero). Here its angular 
momentum is so large that h = GM. Let us denote by $M_0$ the starting
mass of the black hole in this extreme state. Then its surface area is (by
the formula given in the text)

$$A_0 = 8\pi R_+^2 = 8\pi\,\frac{G^2M_0^2}{c^4}.$$

In the final state its mass is $M_1$, say. The area of a Schwarzschild black
hole of this mass is

$$A_1 = 16\pi\,\frac{G^2M_1^2}{c^4}.$$

By the second law of black-hole physics $A_1$ cannot be less than $A_0$.
At best we may equate $A_0$ to $A_1$, giving

$$M_1 = \frac{M_0}{\sqrt{2}}.$$

Thus the Penrose process can at most extract $(M_0 - M_0/\sqrt{2})c^2$ of the
original mass energy $M_0c^2$. The available fraction of energy is therefore
$(\sqrt{2} - 1)/\sqrt{2}$, i.e., nearly 30% of the total energy. (In contrast,
the hydrogen-fusion process yields only ~0.7% of the total energy.)

The mass $M_1$ is known as the irreducible mass of the Kerr black
hole.

So, in principle, we can extract energy from a Kerr-Newman black
hole until it reaches the state of no charge (Q = 0) and no angular
momentum (H = 0). This state is that of the Schwarzschild black
hole and it is characterized by its mass only. We call it irreducible
mass since henceforth the second law prohibits any energy extraction.
Since, in reaching this state, we have not changed the surface area
(vide the condition $\delta A = 0$), we can relate the final irreducible mass
to the area of the black hole. Thus we have the irreducible mass $M_1$ as
given by

$$4G^2M_1^2 = h^2 + R_+^2,$$

i.e.,

$$M^2 = \left(M_1 + \frac{Q^2}{4GM_1}\right)^2 + \frac{H^2}{4G^2M_1^2}. \tag{13.53}$$


Evidently this is the ‘most efficient scenario’! If there is any irre¬ 
versibility in the energy-extraction process, the irreducible mass would 
increase. 
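
The irreducible mass and the limit on the Penrose process can be checked with a few lines of Python. The sketch below is illustrative: it assumes c = 1, defines the irreducible mass through the equal-area condition $4G^2M_1^2 = h^2 + R_+^2$, and uses hypothetical function names. It then inverts the relation using the mass formula of Equation (13.53).

```python
import math

def irreducible_mass(M, Q=0.0, H=0.0, G=1.0):
    """Irreducible mass from 4 G^2 M_1^2 = h^2 + R_+^2, with h = H/M, c = 1."""
    h = H / M
    s = math.sqrt(max((G * M) ** 2 - G * Q ** 2 - h ** 2, 0.0))
    r_plus = G * M + s
    return math.sqrt(r_plus ** 2 + h ** 2) / (2.0 * G)

def mass_from_irreducible(M1, Q=0.0, H=0.0, G=1.0):
    """Total mass M recovered from the irreducible mass via Eq. (13.53)."""
    return math.sqrt((M1 + Q ** 2 / (4.0 * G * M1)) ** 2
                     + H ** 2 / (4.0 * G ** 2 * M1 ** 2))

if __name__ == "__main__":
    # Extreme Kerr hole: H = G M^2 with G = M = 1.
    m1 = irreducible_mass(1.0, H=1.0)
    print("irreducible mass    :", m1)
    print("extractable fraction:", 1.0 - m1)
    # Round trip for a generic Kerr-Newman hole.
    m1 = irreducible_mass(1.0, Q=0.3, H=0.5)
    print("recovered M         :", mass_from_irreducible(m1, Q=0.3, H=0.5))
```

For the extreme Kerr case the extractable fraction evaluates to $1 - 1/\sqrt{2}$, the 'nearly 30%' quoted in the text, and the round trip returns the starting mass.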

We next consider the problem of interest to astronomers, viz. how 
to detect black holes. 


13.8 Detection of black holes 

Given the fact that a black hole cannot be seen by detecting any form 
of light, how does one know that a black hole is located in some spec¬ 
ified region? The answer is indicated by the following thought exper¬ 
iment. Suppose the Sun becomes a black hole. It will no longer be 
visible from the Earth. Nevertheless, we on the Earth would be able 
to deduce from the orbit of the Earth that there exists a source of
attraction at the location of the Sun, with the mass of the Sun. This
is because the black hole continues to exert gravitational force even if 
not seen. 

Following this example, an ideal scenario for the detection of a 
black hole is if it has a companion that is easily visible. For exam¬ 
ple, if the black hole is a member of a binary star system, then by 
watching its companion move we can deduce the presence of an invis¬ 
ible mass. If from the dynamics of the system we are able to place 
a lower limit on the mass of the invisible object, and it exceeds, say, 
$3M_\odot$, then we can assert that the object is a black hole. (Vide the limits
placed on stellar masses existing as white dwarfs or neutron stars, in 
Chapter 12.) 

The first strong case for a black hole in a binary system was that
of Cygnus X-1, an X-ray source. (See Reference [44].) Figure 13.7
illustrates the typical binary X-ray-source scenario. Here we have a
supergiant star and a black hole going round their common barycentre.
The black hole exerts an attractive force strong enough to pull the loosely
attached outer layers of plasma from the companion. The plasma goes
round and round the black hole as it spirals in and ultimately falls in
across the horizon. The infalling material, because of its viscosity, gets
heated and radiates through the process known as bremsstrahlung. At
the high temperature of about a million degrees, this radiation is in the
form of X-rays. Cygnus X-1, which was first found in the mid 1970s,
proved to be typical of several similar examples of X-ray binaries in
which the companion was invisible. However, a large fraction of these
turned out to be neutron stars rather than black holes, since their masses
did not exceed $2M_\odot$. The invisible star in Cygnus X-1, in contrast, has
mass more than $8M_\odot$.

During the mid 1980s, observers of galaxies began reporting
massive black holes (of $10^8$ to $10^9\,M_\odot$) in the centres of galaxies
like M87 (see Figure 13.8). The presence of such massive black
holes is inferred from the large dynamical activity of nearby stars
as indicated by their large spectral shifts, or by the abnormal rise
in luminosity of the region. The latter effect arises because of the
concentration of stars near the black hole, which is brought about by
its powerful attraction. This may appear paradoxical in the sense that
the black hole itself is rendered invisible because of its strong
attraction!

Fig. 13.7. The binary X-ray scenario which in the case of the X-ray
source Cygnus X-1 provides indirect evidence for its invisible component
being a black hole (shown in the figure as a dark sphere).

Fig. 13.8. M87 photographed with its nucleus that is believed to house a
black hole. An indication of violent activity in the nucleus is the
emergence of the jet seen. (Photograph by courtesy of NASA.)

We end this chapter with a brief description of a white hole, which 
in some sense is the opposite of a black hole. 


13.9 White holes 

We arrived at the notion of a black hole through the phenomenon of 
gravitational collapse of a dust ball. Following the 1975 analysis by 
Narlikar, Appa Rao and Dadhich [45], we now consider a time-reversed 
solution generated by changing the coordinate t to —t in Section 13.2. 
Thus the line element is

$$ds^2 = dt^2 - S^2(t)\left[\frac{dr^2}{1 - \alpha r^2} + r^2(d\theta^2 + \sin^2\theta\,d\phi^2)\right] \tag{13.54}$$


with the function S(t) satisfying the differential equation

$$\dot{S}^2 = \alpha\,\frac{1 - S}{S} \tag{13.55}$$

and the boundary conditions S = 0 at $t = -t_0$ and S = 1, $\dot{S} = 0$ at
t = 0. Instead of gravitationally collapsing, the dust ball erupts as an
explosive event at $t = -t_0$. While the behaviours of the collapsing
ball and expanding ball can be related through time symmetry in the
t-coordinate, the two solutions look asymmetrically different from the
vantage point of a distant Schwarzschild observer. The collapsing ball
is seen to disappear slowly into the event horizon of the black hole. Let
us now see how the exploding ball looks from outside.

We consider a radial signal leaving the surface of the expanding ball
at the Schwarzschild time $T_1$ and reaching a distant observer at constant
Schwarzschild $R = R_2$ coordinate at time $T_2$. The relation between these
quantities is

$$\int_{R_1}^{R_2}\left(1 - \frac{2GM}{R}\right)^{-1}dR = T_2 - T_1. \tag{13.56}$$

$R_1$ is (as before) the changing value of the Schwarzschild coordinate of
the white-hole surface.

Next consider a signal sent out a short time $\Delta T_1$ later, arriving at
the observer at $T_2 + \Delta T_2$. During this period $R_2$ has not changed,
but $R_1$ has. Since $R_1 = r_bS(t)$, we can write

$$\Delta T_2 - \Delta T_1 = -\frac{r_b\dot{S}(t_1)\,\Delta t_1}{1 - \dfrac{2GM}{R_1}}. \tag{13.57}$$


From Equations (13.26) and (13.29) we can deduce that

$$\Delta T_1 = \frac{\sqrt{1 - \alpha r_b^2}}{1 - \dfrac{2GM}{R_1}}\,\Delta t_1. \tag{13.58}$$


Therefore, we have from the above two relations the result

$$\frac{\Delta T_2}{\Delta t_1} = \frac{\sqrt{1 - \alpha r_b^2} - r_b\dot{S}(t_1)}{1 - \dfrac{2GM}{R_1}}. \tag{13.59}$$


At $R_2 \gg 2GM$ we may treat T as the proper time of the observer
at $R_2$. Hence a light wave sent from the surface of the expanding ball
undergoes a spectral shift z given by

$$(1 + z)^{-1} = \frac{1 - \dfrac{2GM}{R_1}}{\sqrt{1 - \alpha r_b^2} - r_b\dot{S}(t_1)}. \tag{13.60}$$

This expression is well behaved outside the Schwarzschild horizon.
However, it comes as a surprise to find that it is well behaved on the
horizon and inside too! On expressing $\dot{S}(t_1)$ in terms of S, we get,
after a limiting process,

$$(1 + z)^{-1} = \sqrt{1 - \alpha r_b^2} + \sqrt{\frac{\alpha r_b^2(1 - S)}{S}}. \tag{13.61}$$

The value of this ratio at the event horizon is $2\sqrt{1 - \alpha r_b^2}$.

On looking at the Kruskal-Szekeres diagram (Figure 13.3), we see
that such rays are coming from Region IV into Region I. The T-
coordinate behaves strangely, but the t and (u, v) coordinates are well
behaved. The result in the above equation is finite. Since we have used
‘bad’ coordinates (R, T) we have arrived at two infinite integrals whose
difference is finite. If we had used the (u, v) coordinates we could have
got the result without subtraction of infinities.

From Equation (13.61) we see that signals not only come out from
within the R = 2GM surface but also can be blueshifted for

$$S(t_1) < \frac{1}{2}\left(1 + \sqrt{1 - \alpha r_b^2}\right). \tag{13.62}$$

Thus our dust ball, at least during the early stages of expansion, 
resembles a very shiny object with high-energy radiation coming out. 
Hence such objects are called white holes. 
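
The behaviour of the spectral shift across the horizon can be explored numerically. The Python sketch below is illustrative and rests on assumptions carried over from the collapse solution: $\dot{S}^2 = \alpha(1 - S)/S$ for the expanding ball and $2GM = \alpha r_b^3$, so that $2GM/R_1 = \alpha r_b^2/S$. The single parameter `a` stands for $\alpha r_b^2$, and the function names are hypothetical.

```python
import math

def inv_one_plus_z(S, a):
    """Closed form of (1+z)^-1 with a = alpha * r_b**2 (0 < a < 1), 0 < S <= 1.

    Obtained from Eq. (13.60) under the assumptions stated above; it stays
    finite on and inside the horizon S = a.
    """
    return math.sqrt(1.0 - a) + math.sqrt(a * (1.0 - S) / S)

def inv_one_plus_z_direct(S, a):
    """Eq. (13.60) evaluated directly; indeterminate (0/0) on the horizon S = a."""
    r_b_S_dot = math.sqrt(a * (1.0 - S) / S)  # r_b * dS/dt, expanding branch
    return (1.0 - a / S) / (math.sqrt(1.0 - a) - r_b_S_dot)

def blueshift_threshold(a):
    """Value of S below which (1+z)^-1 exceeds 1, i.e. the light is blueshifted."""
    return 0.5 * (1.0 + math.sqrt(1.0 - a))

if __name__ == "__main__":
    a = 0.5
    for S in (0.3, 0.5, 0.7, 0.9, 1.0):
        print(S, inv_one_plus_z(S, a))
    print("blueshift for S <", blueshift_threshold(a))
```

The direct and closed forms agree wherever the direct form is defined, and the closed form passes smoothly through the horizon value $2\sqrt{1 - \alpha r_b^2}$. The factor $(1 + z)^{-1}$ falls steadily as S grows, which is the softening of the radiation with time mentioned in the discussion of white holes.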

Compared with black holes, the white holes enjoy certain advantages 
and also suffer from disadvantages. The advantages are that they are 
readily visible, are exceptionally bright and may have an appearance 
that varies rapidly with time. The disadvantage is that a white hole as 
described here originates in a singularity and physicists are in general 
not happy with systems whose origin they cannot understand. (A major 
exception to this statement is the Universe as a whole, whose most 
popular model, the big-bang model, also originates in a singularity, vide 
Chapter 15.) One could advance a white hole as a source behind transient 
explosive phenomena like the gamma-ray bursts. A signature of white 
holes is that the frequency of their radiation declines with time: the 
factor 1 + z increases. The softening of radiation from a gamma-ray 
burst indicates just that effect.


Exercises

1. By considering a test particle on the surface of the collapsing dust ball 
discussed in the text as falling freely in the external Schwarzschild spacetime, 
deduce the relation of Equation (13.17). 

2. Consider a more general solution of Equation (13.11). Define e ,l/2 = R(r , t). 
Then this equation becomes 


R 2 




Solve this by writing R = r cos 2 TT Show that the general solution includes 
cases wherein the radial displacements diverge while the transverse ones shrink. 
In all cases show that the typical proper volume element converges to zero. 

3. A collapsing dust ball emits radiation radially outwards from its surface. 
Show that as its surface approaches the Schwarzschild barrier the redshift z of the 
radiation received by a distant Schwarzschild observer using the T -coordinate 
increases as 


$$1 + z \propto \exp[T/(4GM)],$$


where M is the mass of the collapsing dust ball. 

4. In the Eddington coordinates, the Schwarzschild T-coordinate is replaced by

$$V = T + R + 2GM\ln\left|\frac{R}{2GM} - 1\right|.$$

Show that V = constant describes an ‘ingoing’ radial null geodesic and that the
Schwarzschild line element is transformed to

$$ds^2 = \left(1 - \frac{2GM}{R}\right)dV^2 - 2\,dV\,dR - R^2(d\theta^2 + \sin^2\theta\,d\phi^2).$$

Construct the ‘outgoing’ Eddington coordinates along the same lines.

5. Show that the following metric describes Schwarzschild’s spacetime:

$$ds^2 = dt^2 - \frac{4}{9}\left[\frac{9GM}{2(r - t)}\right]^{2/3}dr^2 - \left[\frac{9GM}{2}(r - t)^2\right]^{2/3}(d\theta^2 + \sin^2\theta\,d\phi^2).$$

This metric arises on solving the exterior solution for a dust ball collapsing from 
a state of rest at infinite dispersion. 

6. Show that, in the Kerr-Newman black hole, for $R < R_0(\theta)$ no physical
observer can have constant R, $\theta$, $\phi$. If an observer has a constant R
and $\theta$ then he must have an angular velocity

$$\frac{d\phi}{dT} > \frac{h\sin\theta - \sqrt{\Delta}}{(R^2 + h^2)\sin\theta - \sqrt{\Delta}\,h\sin^2\theta}.$$

7. An ‘extreme’ Kerr-Newman black hole is defined by the relation
$G^2M^2 = GQ^2 + h^2$. If $G^2M^2 < GQ^2 + h^2$, there is no horizon (since
$\Delta = 0$ has no real roots). Show that, if a black hole is initially in the
extreme form, it cannot evolve into a state of no horizon under the usual laws
of black-hole physics.



Chapter 14 

The expanding Universe 


14.1 Historical background 

In 1915 Einstein put the finishing touches to the general theory of rel¬ 
ativity. The Schwarzschild solution described in Chapter 9 was the first 
physically significant solution of the field equations of general relativity. 
It showed how spacetime is curved around a spherically symmetric dis¬ 
tribution of matter. The problem solved by Schwarzschild was basically 
a local problem, in the sense that the deviations of spacetime geometry 
from the Minkowski geometry of special relativity gradually dimin¬ 
ish to zero as we move further and further away from the gravitating 
sphere. This result can be easily verified from the Schwarzschild line 
element by letting the radial coordinate go to infinity. In technical jar¬ 
gon a spacetime satisfying this property is called asymptotically flat. 
In general any spacetime geometry generated by a local distribution of 
matter is expected to have this property. Even from Newtonian grav¬ 
ity we expect an analogous result: that the gravitational field of a local 
distribution of matter will die away at a large distance from the dis¬ 
tribution. Can the Universe be approximated by a local distribution of 
matter? 

Einstein rightly felt that the answer to the above question would 
be in the negative. Rather, he expected the Universe to be filled with 
matter, howsoever far we are able to probe it. A Schwarzschild-type 
solution cannot therefore provide the correct spacetime geometry of such 
a distribution of matter. Since we can never get away from gravitating 
matter, the concept of asymptotic flatness must break down. A new type 
of solution was therefore needed to describe a Universe filled everywhere
with matter. Einstein published such a solution (cf. Ref. [46]) in 1917.
It was to launch the subject of theoretical cosmology, that is, a subject
dealing with theoretical modelling of the Universe in the large.


14.1.1 The Einstein universe 

It is evident from the field equations of general relativity that their 
solution in the most general form - the solution of an interlinked set of 
non-linear partial differential equations - is beyond the present range of 
techniques available to applied mathematics. It is necessary to impose 
simplifying symmetry assumptions in order to make any progress 
towards a solution. Just as Schwarzschild assumed spherical symmetry 
in his local solution, Einstein assumed homogeneity and isotropy in 
his cosmological problem. He further assumed, like Schwarzschild, that 
spacetime is static. This enabled him to choose a time coordinate t such 
that the line element of spacetime could be described by 

$$ds^2 = c^2dt^2 - \alpha_{\mu\nu}\,dx^\mu dx^\nu, \tag{14.1}$$

where $\alpha_{\mu\nu}$ are functions of space coordinates $x^\mu$
($\mu, \nu = 1, 2, 3$) only.

Note that the constraint of homogeneity implies that the coefficient
of $dt^2$ can only be a constant, which we have normalized to $c^2$. We
may further assume as hitherto that c = 1. Similarly, the condition of
isotropy tells us that there should be no terms of the form $dt\,dx^\mu$ in
the line element. This can be seen easily in the following way. If we
had terms like $g_{0\mu}\,dt\,dx^\mu$ in the line element, then spatial
displacements $dx^\mu$ and $-dx^\mu$ would contribute oppositely to $ds^2$
over a small time interval dt, and such directional variation would be
observable and would be inconsistent with isotropy. Can we say anything more
about $\alpha_{\mu\nu}$?

We go back to Chapter 6, where we discussed spacetime symme¬ 
tries in general. Referring to the maximally symmetric spaces of three 
dimensions, to which the homogeneous and isotropic model of Einstein 
belonged, we can write down the most general line element for such 
spaces, vide Equation (6.31). However, Einstein felt that the matter- 
filled Universe will have a positive curvature, which will make it close 
onto itself. Thus, of the three alternatives for the curvature parameter k, 
he opted for k = 1, and so the line element of his spacetime became 


$$ds^2 = c^2dt^2 - S^2\left[\frac{dr^2}{1 - r^2} + r^2(d\theta^2 + \sin^2\theta\,d\phi^2)\right]. \tag{14.2}$$


Given this line element, it is now straightforward to compute 
Christoffel symbols and the Ricci tensor. The calculation leads to the
following non-zero components of the Einstein tensor:

$$R^0_0 - \frac{1}{2}R = -\frac{3}{S^2}, \tag{14.3}$$

$$R^1_1 - \frac{1}{2}R = R^2_2 - \frac{1}{2}R = R^3_3 - \frac{1}{2}R = -\frac{1}{S^2}. \tag{14.4}$$

To complete the field equations, Einstein used the energy tensor for
dust derived in (7.20). For dust at rest in the above frame of reference
$u^i$ has only one component, the time component, non-zero. We therefore get

$$T^0_0 = \rho_0c^2, \qquad T^1_1 = T^2_2 = T^3_3 = 0. \tag{14.5}$$


Thus the two equations (14.3) and (14.4) lead to two independent
equations:

$$\frac{3}{S^2} = \frac{8\pi G}{c^2}\,\rho_0, \qquad \frac{1}{S^2} = 0. \tag{14.6}$$

Clearly no sensible solution is possible from these equations, thus
suggesting that no static homogeneous isotropic and dense model of the
Universe is possible under the regime of Einstein equations as stated in (8.3).

This was a setback, for it indicated that either Einstein’s assumptions
about the Universe (homogeneous, isotropic and static) were wrong or
that his set of basic equations was incomplete. The option to get out
of the conundrum that Einstein adopted was the latter. He modified his
field equations, by introducing the so-called ‘$\lambda$-term’, to the form

$$R_{ik} - \frac{1}{2}g_{ik}R + \lambda g_{ik} = -\kappa T_{ik}. \tag{14.7}$$

We briefly discussed this modification in Chapter 8. The constant
$\lambda$ needed for cosmology turns out to be very small, of the order of
$10^{-56}$ cm$^{-2}$. It therefore does not affect the observational checks on
the theory from the Solar-System data. It makes a difference in cosmology,
however, as we will find in the following chapter.

The additional term in the field equations now led Einstein to the
following modified equations for his static model:

$$\lambda - \frac{3}{S^2} = -\frac{8\pi G}{c^2}\,\rho_0 \tag{14.8}$$

and

$$\lambda - \frac{1}{S^2} = 0. \tag{14.9}$$

He now did have a sensible solution. He got

$$S = \frac{c}{2\sqrt{\pi G\rho_0}}. \tag{14.10}$$

Einstein considered this solution as justifying his conjecture that
with sufficiently high density it should be possible to ‘close’ the
Universe. See Reference [46] for the Einstein paper on this topic. In (14.10)
we have the radius S of the Universe as given by the matter density $\rho_0$,
with the result that the larger the value of $\rho_0$, the smaller the value of
S. However, if $\lambda$ is a given universal constant like G, both $\rho_0$ and
S are determined in terms of $\lambda$ (as well as G and c). How big is
$\lambda$?

In 1917 very little information was available about $\rho_0$, from which
$\lambda$ could be determined. The value of

$$S \sim 10^{26}\ \text{to}\ 10^{27}\ \mathrm{cm}$$

quoted in those days is therefore only of historical interest. If we take
$\rho_0 \sim 10^{-31}$ g cm$^{-3}$ as a rough estimate of the mass density in the
form of galaxies, we get $S \sim 10^{29}$ cm and $\lambda \sim 10^{-58}$ cm$^{-2}$.
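
The order-of-magnitude estimates just quoted follow directly from Equations (14.9) and (14.10). The short Python sketch below repeats the arithmetic in CGS units. The numerical values of G and c are standard rounded constants supplied here as assumptions, and the function names are hypothetical.

```python
import math

G_CGS = 6.674e-8   # gravitational constant in cm^3 g^-1 s^-2 (rounded)
C_CGS = 2.998e10   # speed of light in cm s^-1 (rounded)

def einstein_radius(rho0):
    """Radius S of the Einstein universe, Eq. (14.10), for density rho0 in g cm^-3."""
    return C_CGS / (2.0 * math.sqrt(math.pi * G_CGS * rho0))

def einstein_lambda(rho0):
    """Cosmological constant from Eq. (14.9): lambda = 1 / S^2, in cm^-2."""
    return 1.0 / einstein_radius(rho0) ** 2

if __name__ == "__main__":
    rho0 = 1e-31  # g cm^-3, the rough galaxy density used in the text
    print("S      = %.3e cm" % einstein_radius(rho0))
    print("lambda = %.3e cm^-2" % einstein_lambda(rho0))
```

Because S scales as the inverse square root of the density, raising $\rho_0$ by a factor of 100 shrinks S by a factor of 10 and raises $\lambda$ by a factor of 100.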

The $\lambda$-term introduces a force of repulsion between two bodies that
increases in proportion to the distance between them. The attractive force 
of gravity decreases with distance, whereas the above force of repulsion 
increases with distance. Therefore at a specific distance the two would 
balance and provide a static universe. Later it turned out that the model 
was unstable and would either collapse or expand to infinity, depending 
on which of these two forces dominated. Theoretical objections like 
this apart, this model did not survive much longer than a decade, for 
observational reasons discussed in the next section. 


Example 14.1.1 Consider a two-body problem in which a small mass $m$
moves under the influence of a large mass $M$. Ignoring the motion of $M$ and
assuming that $m$ is held at rest at a distance $r$ from $M$, we have the net force
on $m$ as

$$ F = \left( -\frac{GM}{r^2} + \lambda r c^2 \right) m. $$

For equilibrium $F$ must vanish, thus giving a static distance of separation as

$$ r = \left( \frac{GM}{\lambda c^2} \right)^{1/3}. $$

(Note that we have introduced $c^2$ manifestly to preserve the correct
dimensionality.) If, however, the small mass were slightly displaced, $F$ will be
non-zero. Writing the displacement away from $M$ as $\delta r$, we get

$$ \delta F = \left( \frac{2GM}{r^3} + \lambda c^2 \right) m\, \delta r. $$

So, if $\delta r > 0$, $\delta F > 0$, resulting in $m$ moving further away from $M$. Likewise,
if $\delta r < 0$, $\delta F < 0$, thus telling us that $m$ will move towards $M$. These
movements are indicative of instability, since in neither case does $m$ return
to its original position of rest.

This example gives an intuitive feel for why the Einstein model is
unstable.
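
The instability argument of the example can be tried out numerically. The sketch below assumes the force law in the form $F/m = -GM/r^2 + \lambda c^2 r$ and uses arbitrary illustrative units; the helper names and the numerical values of $GM$ and $\lambda c^2$ are hypothetical.

```python
def net_force_per_unit_mass(r, gm, lam_c2):
    """F/m = -GM/r^2 + lambda*c^2*r (attraction plus lambda-repulsion)."""
    return -gm / r**2 + lam_c2 * r


def equilibrium_radius(gm, lam_c2):
    """Separation at which the two terms cancel: r = (GM / (lambda c^2))^(1/3)."""
    return (gm / lam_c2) ** (1.0 / 3.0)


if __name__ == "__main__":
    gm, lam_c2 = 1.0, 1.0e-3          # arbitrary units, chosen for illustration
    r_eq = equilibrium_radius(gm, lam_c2)
    for dr in (-0.01 * r_eq, 0.0, 0.01 * r_eq):
        force = net_force_per_unit_mass(r_eq + dr, gm, lam_c2)
        print(f"dr = {dr:+.3f}  F/m = {force:+.3e}")
```

An outward displacement gives a net outward force and an inward displacement a net inward force, so the displaced mass never returns to the equilibrium point.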


14.1.2 The de Sitter Universe 

Einstein was initially very satisfied by this solution, for it reinforced his
belief that the Universe would essentially be unique in having a definitive 
spacetime geometry determined by the matter distribution. To this end 
his model showed how its radius was determined by its matter density. 

However, his expectation that general relativity can yield only such 
matter-filled spacetimes as solutions of the field equations was proved 
wrong shortly after the publication of his paper in 1917. For, a few 
months later in the same year, W. de Sitter [47] published another solution 
of the field equations (14.7) with the line element given by

$$ ds^2 = \left( 1 - \frac{H^2 R^2}{c^2} \right) c^2\, dT^2 - \frac{dR^2}{1 - H^2 R^2 / c^2} - R^2 (d\theta^2 + \sin^2\theta\, d\phi^2), \tag{14.11} $$

where $H$ is a constant related to $\lambda$ by

$$ H^2 = \frac{\lambda c^2}{3}. \tag{14.12} $$

The remarkable feature of the de Sitter universe is that it is empty.
Moreover, although the above coordinates give the impression that the
universe is static, it is possible to find a new set of coordinates $(t, r, \theta, \phi)$
in terms of which the line element (14.11) takes the manifestly dynamic
form

$$ ds^2 = c^2\, dt^2 - e^{2Ht} [dr^2 + r^2 (d\theta^2 + \sin^2\theta\, d\phi^2)]. \tag{14.13} $$

It is easy to verify that test particles with constant values of $(r, \theta, \phi)$
follow timelike geodesics in this model. Thus the proper separation
between any two particles measured at a given time $t$ increases with time
as $e^{Ht}$. That is, these particles are all moving apart from one another.
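
To get a feel for the numbers, the expansion rate implied by a given $\lambda$ can be evaluated directly. This is a rough sketch, assuming the standard de Sitter relation $H = c\sqrt{\lambda/3}$ and the value $\lambda \sim 10^{-56}$ cm$^{-2}$ mentioned earlier; the function names and rounded constants are introduced here for illustration only.

```python
import math

C_CM_S = 2.998e10          # speed of light, cm/s (rounded)
KM_PER_MPC = 3.0856e19     # kilometres in one megaparsec


def de_sitter_hubble_per_s(lam_cm2):
    """H = c * sqrt(lambda / 3), in s^-1, for lambda in cm^-2."""
    return C_CM_S * math.sqrt(lam_cm2 / 3.0)


def separation_growth(h_per_s, dt_s):
    """Factor e^{H dt} by which a proper separation grows in time dt."""
    return math.exp(h_per_s * dt_s)


if __name__ == "__main__":
    h = de_sitter_hubble_per_s(1e-56)
    print(f"H = {h:.2e} s^-1 = {h * KM_PER_MPC:.0f} km/s/Mpc")
    print(f"separations double every {math.log(2.0) / h:.2e} s")
```

The doubling time $\ln 2 / H$ is independent of epoch, which is the characteristic feature of exponential expansion.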

However, these particles have no material status. They have no 
masses and they do not influence the geometry of spacetime. In the 
dynamic sense the universe is empty, although in the kinematic sense it 
is expanding. As Eddington once put it, the de Sitter universe has motion
without matter, in contrast to the Einstein universe, which has matter
without motion.


14.2 The expanding Universe

Although de Sitter’s universe was of academic interest because it
contained no matter, it possessed a feature that turned out to have contact
with reality, as was discovered a few years later, namely the fact that the 
Universe is expanding. 

In 1929 Edwin Hubble published a paper in the Proceedings of the 
National Academy of Sciences [48] that turned out to be the trend-setter 
of modern cosmology. Just as Einstein’s model marked the beginning 
of theoretical cosmology, so did Hubble’s findings launch observational 
cosmology, the subject dealing with the observational studies of the 
large-scale Universe. 

Hubble’s conclusions were based on a long series of observations that 
had started with V. M. Slipher in 1912 and to which several astronomers
had contributed, including Hubble himself and his coworker Milton 
Humason. These observations typically looked at the spectra of nearby 
nebulae, which were believed to be galaxies of stars in their own right 
just like the Milky Way. Barring very few exceptions, which included the 
great galaxy in Andromeda, the majority of spectra showed absorption 
lines that were shifted to the red end. This is another instance of a class 
of astronomical objects showing redshift. 

Although one may use the relativistic Doppler-shift formula (1.64)
derived in Chapter 1, because the redshifts are very small (of the order
of a few parts in a thousand) one may use the simpler Newtonian limit of
that formula for $|v|/c \ll 1$, and write the speed of recession of a galaxy of
redshift $z$ as


$$ v = c \times z. \tag{14.14} $$

Going beyond this result, however, Hubble found that the velocities 
so computed were increasing in proportion to the distances D of the 
galaxies from us. Figure 14.1 is based on Hubble’s early data. 

Hubble’s findings can be written in the following form: 


$$ v = H \times D. \tag{14.15} $$

Thus, in whichever direction we look, we find galaxies moving radially 
away from us. Does that mean that we are in a special position in the 
Universe? Rather the opposite! If we sit on any other galaxy and observe 
the Universe from there, we would see exactly the same picture: namely 
the other galaxies, including the Milky Way, are receding from us. 
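
The claim that every galaxy sees the same picture can be illustrated with simple vector arithmetic. The sketch below works in a Newtonian, Euclidean picture, which is an assumption made only for illustration: if each galaxy at position $\mathbf{D}$ recedes from us with velocity $H\mathbf{D}$, then positions and velocities measured relative to any other galaxy obey the same law. The galaxy positions and the value of $H$ are hypothetical.

```python
def recession_velocity(position, hubble):
    """Velocity v = H * D for a galaxy at vector position D from the observer."""
    return tuple(hubble * x for x in position)


def as_seen_from(observer_pos, galaxy_pos, hubble):
    """Relative position and velocity of a galaxy measured from another galaxy."""
    rel_pos = tuple(g - o for g, o in zip(galaxy_pos, observer_pos))
    v_obs = recession_velocity(observer_pos, hubble)
    v_gal = recession_velocity(galaxy_pos, hubble)
    rel_vel = tuple(vg - vo for vg, vo in zip(v_gal, v_obs))
    return rel_pos, rel_vel


if __name__ == "__main__":
    H = 0.5                       # arbitrary units
    a = (2.0, -1.0, 4.0)          # hypothetical galaxy A (the new observer)
    b = (-3.0, 5.0, 1.0)          # hypothetical galaxy B
    rel_pos, rel_vel = as_seen_from(a, b, H)
    print(rel_pos, rel_vel)       # rel_vel is H times rel_pos
```

The linearity of the velocity-distance relation is what makes this work: any other dependence of $v$ on $D$ would single out a preferred centre.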

How can we express this phenomenon in the language of general
relativity? Can we generate models of the Universe that combine de
Sitter’s notion of expansion with Einstein’s notion of non-emptiness?
This was the challenge to theoretical cosmologists. For, with this
large-scale radial expansion, it became impossible to maintain the myth of
a static universe. Models of an expanding universe were needed. The
Friedmann models to be discussed shortly do just that, and were in fact
obtained by Alexander Friedmann between 1922 and 1924, seven years
before Hubble’s data became well known [49]. Later Abbé Lemaître in
1927 [50] also independently obtained models similar to Friedmann’s.
However, until the impact of Hubble’s observations of 1929, these
models remained largely unrecognized.


14.3 Basic assumptions of cosmology 

Once we decide to generalize from a static to a non-static model of the 
Universe, our task becomes more complicated. Figure 14.2(a) shows a 
spacetime diagram with a swarm of world lines representing particles 
moving in arbitrary ways. There is no order in this picture, and where 
two world lines intersect we have colliding particles. It would indeed 
be very difficult to solve the Einstein field equations for such a mess of 
gravitating matter. Fortunately, the real Universe does not appear to be 
so messy. 

Hubble’s observations indicate that the Universe is (or at least seems 
to be) an orderly structure in which the galaxies, considered as basic 
units, are moving apart from one another in a systematic manner. Thus 
Figure 14.2(b) represents a typical spacetime section of the Universe in 
which the world lines represent the histories of galaxies. These world 
lines, unlike those of Figure 14.2(a), are non-intersecting and form a 
funnel-like structure in which the separation between any two world 
lines is steadily increasing. One may compare Figure 14.2(b) with the 
disciplined march of an army unit, and Figure 14.2(a) with a jostling 
mob after a rowdy football match. 


Fig. 14.1. Hubble's redshift-distance relation showing that for larger
distances (D) the galaxies' radial velocities (v) are proportionately
larger. The velocity of a galaxy is proportional to its redshift.
[Horizontal axis: distance D (millions of light years).]

Fig. 14.2. In (a) we have world lines with no set pattern, intersecting
one another on some occasions and in general describing arbitrary motions
of particles. In (b) we see world lines showing systematic motion with
each spatial point identified with a unique member of the set. The Weyl
postulate stipulates that large-scale motions of galaxies come close to
(b). [Axis label: t.]



14.3.1 Weyl's postulate 

This intuitive picture of regularity is often expressed formally as the Weyl 
postulate, after the early work of the mathematician Hermann Weyl. The 
postulate states that the world lines of galaxies form a 3-bundle of
non-intersecting geodesics orthogonal to a series of spacelike hypersurfaces.

To appreciate the full significance of Weyl’s postulate, let us try to
express it in terms of the coordinates and metric of spacetime.
Accordingly we use three spacelike coordinates $x^\mu$ ($\mu = 1, 2, 3$) to label a
typical world line in the 3-bundle of galaxy world lines. Further, let the
coordinate $x^0$ label a typical member of the series of spacelike
hypersurfaces mentioned above. Thus

$$ x^0 = \text{constant} $$

is a typical spacelike hypersurface orthogonal to the typical world line
given by

$$ x^\mu = \text{constant}. $$

Although in practice the galaxies form a discrete set, we can extend
the discrete set $(x^\mu)$ to a continuum by the smooth-fluid approximation.
This approximation is none other than the widely used device of going
over from a discrete distribution of particles to a continuum density
distribution. In this case we can treat the quantities $x^\mu$ as forming a
continuum along with $x^0$ and use them as the four coordinates $x^i$ to
describe space and time.

It is worth emphasizing the importance of the non-intersecting world
lines. If two galaxy world lines did intersect, our coordinate system above
would break down, for we would then have two different values of $x^\mu$
specifying the same spacetime point (the point of intersection). In the
next chapter we will, however, encounter an exceptional situation in
which all world lines intersect at one singular point!

Let the metric in terms of these coordinates be given by the tensor
$g_{ik}$. What can we assert about this metric tensor on the basis of the Weyl
postulate? The orthogonality condition tells us that

$$ g_{0\mu} = 0. \tag{14.16} $$

Further, the fact that the line $x^\mu$ = constant is a geodesic tells us that
the geodesic equations

$$ \frac{d^2 x^i}{ds^2} + \Gamma^i_{kl} \frac{dx^k}{ds} \frac{dx^l}{ds} = 0 \tag{14.17} $$

are satisfied for $x^\mu$ = constant, $\mu = 1, 2, 3$. Therefore

$$ \Gamma^\mu_{00} = 0, \quad \mu = 1, 2, 3. \tag{14.18} $$

From (14.16) and (14.18) we therefore get

$$ \frac{\partial g_{00}}{\partial x^\mu} = 0, \quad \mu = 1, 2, 3. \tag{14.19} $$

Thus $g_{00}$ depends on $x^0$ only. Following the trick used earlier, we can
therefore replace $x^0$ by a suitable function of $x^0$ to make $g_{00}$ constant.
Hence we take, without loss of generality,

$$ g_{00} = 1. \tag{14.20} $$

The line element therefore becomes

$$ ds^2 = (dx^0)^2 + g_{\mu\nu}\, dx^\mu dx^\nu = c^2 dt^2 + g_{\mu\nu}\, dx^\mu dx^\nu, \tag{14.21} $$

where we have put $ct = x^0$. This time coordinate is called the cosmic
time. It is easily seen that the spacelike hypersurfaces in Weyl’s
postulate are the surfaces of simultaneity with respect to the cosmic time.
Moreover, $t$ is the proper time kept by any galaxy.


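Example 14.3.1 Problem. Suppose we retain the homogeneity assumption
of the Weyl postulate but give up isotropy. This allows $g_{0\mu}$ terms. Show that
$g_{0\mu}$ are independent of time ($\mu = 1, 2, 3$).

Solution. We still have $x^\mu$ = constant as a geodesic. So

$$ \frac{d^2 x^0}{ds^2} + \Gamma^0_{ij} \frac{dx^i}{ds} \frac{dx^j}{ds} = 0, $$

$$ \frac{d^2 x^\mu}{ds^2} + \Gamma^\mu_{ij} \frac{dx^i}{ds} \frac{dx^j}{ds} = 0. $$

Since $x^\mu$ = constant, $dx^\mu/ds = 0$, $d^2 x^\mu/ds^2 = 0$. The above equations
therefore give

$$ \Gamma^\mu_{00} \left( \frac{dx^0}{ds} \right)^2 = 0. $$

Hence $\Gamma^\mu_{00} = 0 \Rightarrow \Gamma_{\mu|00} = g_{\mu 0} \Gamma^0_{00} + g_{\mu\nu} \Gamma^\nu_{00} = g_{\mu 0} \Gamma^0_{00} = 0$. This implies

$$ \partial g_{0\mu} / \partial t = 0. $$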


14.3.2 The cosmological principle 

The second important assumption of cosmology is embodied in the 
cosmological principle. This principle states that, at any given cosmic 
time, the Universe is homogeneous and isotropic. In practical terms it 
means that, if you are blindfolded and taken to any part of the Universe, 
then, on the removal of the eye-cover, you would, on the basis of your 
observations, be able neither to say where you are nor to identify the 
direction in which you are looking. 

We have already come across such spaces in Chapter 6, under the
category of maximally symmetric spaces. As seen in Equation (6.31),
we are able to write the line element of such spaces in the form

$$ d\sigma^2 = S^2 \left[ \frac{dr^2}{1 - kr^2} + r^2 (d\theta^2 + \sin^2\theta\, d\phi^2) \right]. \tag{14.22} $$

The parameter $k$ takes the value $0$, $-1$ or $+1$, and the factor $S$ is spatially constant.
It could, however, be a function of cosmic time $t$ without affecting any
of the symmetries above. Thus the most general line element satisfying
the Weyl postulate and the cosmological principle is given by

$$ ds^2 = c^2 dt^2 - S^2(t) \left[ \frac{dr^2}{1 - kr^2} + r^2 (d\theta^2 + \sin^2\theta\, d\phi^2) \right], \tag{14.23} $$

where the 3-spaces $t$ = constant are Euclidean, or flat, for $k = 0$, closed
with positive curvature for $k = +1$, and open with negative curvature
for $k = -1$. For reasons that will become clearer later, the scale factor
$S(t)$ is often called the expansion factor.

The line element (14.23) that we have obtained was rigorously
derived in the 1930s by H. P. Robertson and A. G. Walker,
independently [51, 52]. It is often referred to as the Robertson-Walker line
element.

The Robertson-Walker line element is sometimes expressed in a
slightly different form with the help of the following radial coordinate
transformation:

$$ \bar{r} = \frac{2r}{1 + \sqrt{1 - kr^2}}. \tag{14.24} $$

We then get the line element as

$$ ds^2 = c^2 dt^2 - \frac{S^2(t)}{(1 + k\bar{r}^2/4)^2} [d\bar{r}^2 + \bar{r}^2 (d\theta^2 + \sin^2\theta\, d\phi^2)]. \tag{14.25} $$

This line element is manifestly isotropic in $\bar{r}, \theta, \phi$. We will, however,
continue to use (14.23).
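
The equivalence of the two radial coordinates can be checked numerically. The sketch below assumes the transformation in the form $\bar{r} = 2r/(1 + \sqrt{1 - kr^2})$, whose inverse is $r = \bar{r}/(1 + k\bar{r}^2/4)$, and verifies that the radial parts of the two line elements agree, i.e. that $dr/\sqrt{1 - kr^2} = d\bar{r}/(1 + k\bar{r}^2/4)$. The function names and the finite-difference step are choices made for this illustration.

```python
import math


def to_isotropic(r, k):
    """r_bar = 2 r / (1 + sqrt(1 - k r^2))."""
    return 2.0 * r / (1.0 + math.sqrt(1.0 - k * r * r))


def from_isotropic(r_bar, k):
    """Inverse map r = r_bar / (1 + k r_bar^2 / 4)."""
    return r_bar / (1.0 + 0.25 * k * r_bar * r_bar)


def radial_factors(r, k, h=1e-6):
    """Return (1/sqrt(1 - k r^2), (d r_bar/dr) / (1 + k r_bar^2 / 4)).

    The derivative is estimated by a central difference; the two numbers
    should agree if the two forms of the line element are equivalent.
    """
    d_rbar = (to_isotropic(r + h, k) - to_isotropic(r - h, k)) / (2.0 * h)
    r_bar = to_isotropic(r, k)
    return 1.0 / math.sqrt(1.0 - k * r * r), d_rbar / (1.0 + 0.25 * k * r_bar**2)


if __name__ == "__main__":
    for k in (1, 0, -1):
        print(k, to_isotropic(0.5, k), radial_factors(0.5, k))
```

For $k = 0$ the two coordinates coincide, as expected for flat space.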

Notice how the simplifying postulates of cosmology have reduced
the number of unknowns in the metric tensor from 10 to the single
function $S(t)$ (of only one independent variable) and the discrete parameter
$k$ that characterize the Robertson-Walker metric. To determine these
unknowns we need to solve the Einstein field equations, as was done by
Friedmann and Lemaître. We will defer this exercise to the following
chapter.


14.4 Hubble's law 

Let us first try to understand how the nebular redshift found by Hubble
and Humason is accounted for by the Robertson-Walker model. We
begin by recalling that the basic units of Weyl’s postulate are galaxies
with constant coordinates $x^\mu$. We readily identify the $x^\mu$ with the $(r, \theta, \phi)$
of Robertson-Walker spacetime. Thus each galaxy has a constant set of
coordinates $(r, \theta, \phi)$. This coordinate frame is often referred to as the
cosmological rest frame. As observers we are located in our Galaxy,
which also has constant $(r, \theta, \phi)$ coordinates. Without loss of generality
we can take $r = 0$ for our Galaxy. Although this assumption suggests that
we are placing ourselves at the centre of the Universe, it does not confer
any special status on us. Because of the assumption of homogeneity,
any galaxy could be chosen to have its radial coordinate $r = 0$. Our
particular choice is simply dictated by convenience.





14.4.1 Redshift 


Consider a galaxy $G_1$ at $(r_1, \theta_1, \phi_1)$ emitting light waves towards us. Let
us denote by $t_0$ the present epoch of observation. At what time should
a light wave have left $G_1$ in order to arrive at $r = 0$ at the present time
$t = t_0$? To find the answer to this question, we need to know the path
of the wave from $G_1$ to us. Since light travels along null geodesics, we
need to calculate the null geodesic from $G_1$ to us.

From the symmetry of the spacetime we can guess that a null geodesic
from $r = 0$ to $r = r_1$ will maintain a constant spatial direction. That is,
we expect to have $\theta = \theta_1$, $\phi = \phi_1$ all along the null geodesic. This guess
proves to be correct when we substitute these values into the geodesic
equations. Accordingly we will assume that only $r$ and $t$ change along
the null geodesic. Next we recall that a first integral of the null geodesic
equation is simply $ds = 0$. For the Robertson-Walker line element this
gives us

$$ \frac{c\, dt}{S(t)} = \pm \frac{dr}{\sqrt{1 - kr^2}}. \tag{14.26} $$


Since $r$ decreases as $t$ increases along this null geodesic, we should take
the minus sign in the above relation. Suppose the null geodesic left $G_1$
at time $t_1$. Then we get from the above relation

$$ \int_{t_1}^{t_0} \frac{c\, dt}{S(t)} = \int_0^{r_1} \frac{dr}{\sqrt{1 - kr^2}}. \tag{14.27} $$

Thus, if we know $S(t)$ and $k$, we know the answer to our question.

However, consider what happens to successive wave crests emitted
by $G_1$. Suppose the wave crests were emitted at $t_1$ and $t_1 + \Delta t_1$ and
received by us at $t_0$ and $t_0 + \Delta t_0$, respectively. Then, similarly to (14.27),
we have

$$ \int_{t_1 + \Delta t_1}^{t_0 + \Delta t_0} \frac{c\, dt}{S(t)} = \int_0^{r_1} \frac{dr}{\sqrt{1 - kr^2}}. \tag{14.28} $$

If $S(t)$ is a slowly varying function, so that it effectively remains
unchanged over the small intervals $\Delta t_0$ and $\Delta t_1$, we get by
subtraction of (14.27) from (14.28)

$$ \frac{c\, \Delta t_0}{S(t_0)} - \frac{c\, \Delta t_1}{S(t_1)} = 0, $$

that is,

$$ \frac{c\, \Delta t_0}{c\, \Delta t_1} = \frac{S(t_0)}{S(t_1)} \equiv 1 + z. \tag{14.29} $$

It is not difficult to see that the quantity $z$ defined above is the redshift.
The term $c\,\Delta t_1$ is, evidently, the wavelength $\lambda_1$ measured by an observer
at rest in the galaxy $G_1$, while $c\,\Delta t_0$ is the wavelength $\lambda_0$ measured by an
observer at rest in our Galaxy, since in the Robertson-Walker spacetime
the cosmic time measures the proper time kept by any galaxy. Thus
the wavelength of the light wave increases by a fraction $z$ during the
transmission from $G_1$ to us, provided that $S(t_0) > S(t_1)$. In other words,
Hubble’s observations of redshift are explained if we assume $S(t)$ to
be an increasing function of time. We will refer to this redshift as the
cosmological redshift. Let us view it in comparison with the two other
types of redshifts we have so far encountered.

Our derivation above shows that the cosmological redshift arises
from the passage of light through an expanding non-Euclidean
spacetime. Although in the early days of its discovery it was considered
a manifestation of the Doppler effect, the correct general-relativistic
treatment shows that it does not arise from the Doppler effect, since
in our coordinate frame all galaxies have constant $(r, \theta, \phi)$ coordinates.
Further, in a non-Euclidean spacetime it is not possible to attach an
unambiguous meaning to the relative velocity of two objects separated
by a great distance. People are often tempted to relate $z$ to velocity by
the special-relativistic relation

$$ 1 + z = \sqrt{\frac{1 + v/c}{1 - v/c}}. \tag{14.30} $$

Such an interpretation is not valid in our present framework because,
as we saw in Chapter 5, special relativity applies only in a locally flat
region of spacetime.

It is also necessary to contrast (14.29) with the gravitational redshift 
described in Chapter 9. The gravitational redshift is characterized by the 
fact that, if light travelling from object B to object A is redshifted, the 
light travelling from A to B is blueshifted. In the present case, if light 
travelling from galaxy A to galaxy B is redshifted, that travelling from 
B to A will also be redshifted, provided that S(t) is increasing during the 
transmission of light. 
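
The stretching of the interval between successive wave crests, on which the cosmological redshift rests, can be checked numerically for a specific expansion law. The sketch below assumes, purely for illustration, $k = 0$, units with $c = 1$, and a hypothetical scale factor $S(t) = t^{2/3}$, for which the time integral in the null-geodesic relation can be done in closed form. None of these choices comes from the text.

```python
def scale_factor(t):
    """Hypothetical expansion law S(t) = t^(2/3), chosen only for illustration."""
    return t ** (2.0 / 3.0)


def conformal_integral(t_emit, t_obs):
    """Integral of dt / S(t) from t_emit to t_obs for S = t^(2/3), with c = 1."""
    return 3.0 * (t_obs ** (1.0 / 3.0) - t_emit ** (1.0 / 3.0))


def arrival_time(t_emit, r1):
    """Reception time of a crest emitted at t_emit from coordinate distance r1.

    Solves conformal_integral(t_emit, t_obs) = r1 for t_obs (k = 0 case).
    """
    return (r1 / 3.0 + t_emit ** (1.0 / 3.0)) ** 3


if __name__ == "__main__":
    t1, r1, dt1 = 1.0, 1.5, 1e-6
    t0 = arrival_time(t1, r1)
    dt0 = arrival_time(t1 + dt1, r1) - t0      # spacing of crests at reception
    print("interval ratio    :", dt0 / dt1)
    print("scale-factor ratio:", scale_factor(t0) / scale_factor(t1))
```

The two printed numbers agree to within the size of $\Delta t_1$, which is the content of the relation $\Delta t_0 / \Delta t_1 = S(t_0)/S(t_1) = 1 + z$.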

In conclusion, we also consider the oft-expressed confusion at the
existence of objects with redshifts greater than 1. Normally the Doppler
shift $z$ is interpreted as arising from a source receding from us with the
speed $c \times z$. How, then, it is asked, is it that we have objects travelling
faster than light in spite of the light-speed limit imposed by special
relativity? This way of looking at things is wrong on at least two counts.
First, the formula used to compute velocity is Newtonian and needs to be
replaced if one is applying special relativity. The special-relativistic
formula above gives $|v| < c$ for all $z$ howsoever large. Secondly, in the
cosmological case, the correct formula is not (14.30) but (14.29). The
latter simply tells us that the light from the redshifted object left it when
the scale factor was $(1 + z)^{-1}$ of its present value. In an expanding
Universe this may well be possible. Since we are dealing with curved
spacetime, there is no justification in invoking formulae and results from
special relativity, which does not apply here.
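
The two points made above can be seen side by side in a small table. The following sketch, with function names introduced here for illustration, evaluates the Newtonian estimate $v/c = z$, the velocity obtained by inverting the special-relativistic relation $1 + z = \sqrt{(1 + v/c)/(1 - v/c)}$, and the scale-factor ratio $S(t_1)/S(t_0) = (1 + z)^{-1}$. Only the last of these has a direct meaning in the cosmological setting.

```python
def newtonian_speed_over_c(z):
    """Low-redshift estimate v/c = z."""
    return z


def special_relativistic_speed_over_c(z):
    """Invert 1 + z = sqrt((1 + v/c) / (1 - v/c)) for v/c."""
    q = (1.0 + z) ** 2
    return (q - 1.0) / (q + 1.0)


def scale_factor_ratio(z):
    """S(t1) / S(t0) = 1 / (1 + z): relative size of the Universe at emission."""
    return 1.0 / (1.0 + z)


if __name__ == "__main__":
    print("   z     v/c (Newton)  v/c (SR)   S(t1)/S(t0)")
    for z in (0.003, 0.5, 1.0, 3.0):
        print(f"{z:6.3f}  {newtonian_speed_over_c(z):10.3f}  "
              f"{special_relativistic_speed_over_c(z):9.4f}  "
              f"{scale_factor_ratio(z):10.4f}")
```

At $z = 0.003$ the two velocity estimates are indistinguishable, whereas for $z > 1$ the Newtonian one exceeds $c$ while the special-relativistic one never does.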


14.4.2 The velocity-distance relation 

With the framework developed so far, we can derive Hubble’s law for
low-redshift galaxies. The largest redshift in Hubble’s 1929 paper was
$z = 0.003$. At these small redshifts we can use the Taylor expansion to
derive a simple linear relation for the distance $D_1$ of a galaxy $G_1$ of
redshift $z_1 \ll 1$. We define the distance at the present epoch $t_0$ as

$$ D_1 = r_1 S(t_0). \tag{14.31} $$

We also get, by the Taylor expansion of (14.27) and (14.29),

$$ \int_0^{r_1} \frac{dr}{\sqrt{1 - kr^2}} \approx r_1, \tag{14.32} $$

$$ \int_{t_1}^{t_0} \frac{c\, dt}{S(t)} \approx \frac{c (t_0 - t_1)}{S(t_0)}, \tag{14.33} $$

$$ S(t_1) \approx S(t_0) - (t_0 - t_1) \dot{S}(t_0), \tag{14.34} $$

$$ S(t_1) = \frac{S(t_0)}{1 + z} \approx S(t_0)(1 - z). \tag{14.35} $$

From these relations we get

$$ D_1 \approx r_1 S(t_0) \approx c (t_0 - t_1) \approx cz \left( \frac{S}{\dot{S}} \right)_{t_0}, \tag{14.36} $$

which can be expressed in the form

$$ cz = H_0 D_1, \tag{14.37} $$

with $H_0$, the Hubble constant, given by

$$ H_0 = \left( \frac{\dot{S}}{S} \right)_{t_0}. \tag{14.38} $$

From a Doppler-shift point of view, $cz$ may be identified with the
velocity of recession at small $z$. In this form (14.37) we have Hubble’s
velocity-distance relation. Expressed as part of the velocity-distance
relation, the Hubble constant has the unit of velocity per unit distance,
the most common unit in usage being kilometres per second per
megaparsec.¹ In many calculations of observational and physical cosmology
we shall use

$$ H_0 = h_0 \times 100\ \text{km s}^{-1}\ \text{Mpc}^{-1}. \tag{14.39} $$

Although Hubble originally obtained $h_0 \sim 5.3$, the present estimate of $h_0$
is much lower. It is still uncertain, and, until recently, was believed to lie in
the range $0.5 < h_0 < 1$. Observations with the Hubble Space Telescope
(HST) and some ground-based telescopes have narrowed down this
range, to around 0.55-0.75. Many cosmologists, however, believe that
$h_0 \approx 0.7$.

Another useful way of expressing $H_0$ is in units of reciprocal time;
that is, by expressing

$$ \tau_0 = H_0^{-1} \tag{14.40} $$

in units of time. A good time unit for $\tau_0$ is the gigayear (Gyr). The present
estimate of $\tau_0$ is in the range of approximately 13-18 Gyr, depending
upon the value chosen for $H_0$. We may refer to $\tau_0$ as the Hubble time
scale.
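
The conversion from $h_0$ to the Hubble time scale is a matter of unit bookkeeping. A minimal sketch, assuming 1 pc = $3.0856 \times 10^{18}$ cm and a year of 365.25 days (the latter is a choice made here, not taken from the text):

```python
KM_PER_MPC = 3.0856e19          # from 1 pc = 3.0856e18 cm
SECONDS_PER_GYR = 3.15576e16    # 1e9 years of 365.25 days (assumed convention)


def hubble_time_gyr(h0):
    """tau_0 = 1 / H_0 in Gyr, for H_0 = h0 * 100 km/s/Mpc."""
    hubble_per_s = h0 * 100.0 / KM_PER_MPC     # H_0 in s^-1
    return 1.0 / hubble_per_s / SECONDS_PER_GYR


if __name__ == "__main__":
    for h0 in (0.55, 0.7, 0.75, 1.0):
        print(f"h0 = {h0:.2f}: tau_0 = {hubble_time_gyr(h0):.1f} Gyr")
```

In round numbers $\tau_0 \approx 9.8\, h_0^{-1}$ Gyr, so the range 0.55-0.75 for $h_0$ corresponds to roughly 13-18 Gyr.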

14.5 The luminosity distance 

The distance $D_1 = r_1 S(t_0)$ we have defined above may be called the
metric distance. From the Robertson-Walker metric we deduce that this
is the present radius of the sphere centred on us, on which the galaxy
is located, the total surface area of the sphere being $4\pi D_1^2$. Using the
practice prevailing in galactic astronomy, we may be tempted to argue
that, if $L$ is the luminosity of the galaxy $G_1$, its apparent flux of radiation
crossing unit area normally at $r = 0$ will be simply

$$ \frac{L}{4\pi D_1^2}. $$

This conclusion would, however, be wrong in the expanding Universe.
Let us do the calculation correctly.

Let $L$ be the total energy emitted by the galaxy $G_1$ in unit time at
the epoch $t_1$ when light left it in order to reach us at the present epoch
$t_0$. The redshift $z$ of the galaxy is therefore given by (14.29). It is now
necessary to specify the wavelength range of observation. To fix ideas,
suppose that the intensity distribution of $G_1$ over wavelengths $\lambda$ is given
by the normalized function $I(\lambda)$. Thus

$$ dL = L\, I(\lambda)\, d\lambda \tag{14.41} $$

is the energy emitted by $G_1$ per unit time over the bandwidth $(\lambda, \lambda + d\lambda)$.
If instead of wavelengths we wanted to use frequencies, the
corresponding intensity function $J(\nu)$ is related to $I(\lambda)$ by

$$ c J(\nu) = \lambda^2 I(\lambda). \tag{14.42} $$

Both $J(\nu)$ and $I(\lambda)$ are used by the astronomer, depending on
convenience.

¹ To those unfamiliar with the parsec as a distance unit, we add that it equals $3.0856 \times 10^{18}$
cm or 3.26 light years. This unit naturally arose from the stellar astronomer’s attempts
to measure distances of stars using the parallax method.

Fig. 14.3. The distribution of light emitted by galaxy $G_1$ is assumed isotropic, i.e.,
distributed uniformly across the surface of the sphere centred on $G_1$.

In the case of isotropic light emission by $G_1$, by the time its light
reaches us it is distributed uniformly across a sphere of coordinate radius
$r_1$ centred on $G_1$ (see Figure 14.3). We have already seen that, in the
Robertson-Walker line element, the area is $4\pi D_1^2$. We now need to
know how much light is received per unit time by us across unit proper
area held perpendicular to the line of sight to $G_1$, over a bandwidth
$(\lambda_0, \lambda_0 + \Delta\lambda_0)$. Denote this quantity by $\mathcal{F}(\lambda_0)\,\Delta\lambda_0$. Now two effects
intervene to make the answer different from that expected from Galactic
astronomy.

Note first that because of redshift the arriving light with wavelengths
in the range $(\lambda_0, \lambda_0 + \Delta\lambda_0)$ left $G_1$ in the wavelength range

$$ \left( \frac{\lambda_0}{1 + z},\ \frac{\lambda_0 + \Delta\lambda_0}{1 + z} \right). $$

Now the total amount of energy that leaves $G_1$ between the epochs $t_1$
and $t_1 + \Delta t_1$ in the above wavelength range is

$$ L\, I\!\left( \frac{\lambda_0}{1 + z} \right) \frac{\Delta\lambda_0}{1 + z}\, \Delta t_1. $$

How many photons carry the above quantity of energy? For a small
enough bandwidth, we may assume that a typical photon had, at
emission, the wavelength $\lambda_0/(1 + z)$, a frequency $(1 + z)c/\lambda_0$, and hence an
energy equal to $(1 + z)ch/\lambda_0$, where $h$ is Planck’s constant. Therefore
the required number of photons is

$$ \delta N = L\, I\!\left( \frac{\lambda_0}{1 + z} \right) \frac{\Delta\lambda_0}{1 + z}\, \Delta t_1 \div \frac{(1 + z) ch}{\lambda_0} = \frac{L \lambda_0}{ch} \frac{1}{(1 + z)^2}\, I\!\left( \frac{\lambda_0}{1 + z} \right) \Delta\lambda_0\, \Delta t_1. $$

At the epoch of reception, these photons are distributed across a surface
area of $4\pi r_1^2 S^2(t_0)$ and are received over a time interval $(t_0, t_0 + \Delta t_0)$.
Thus the number of photons received by us per unit area held normal to
the line of sight and per unit time is given by

$$ \frac{L \lambda_0}{ch} \frac{1}{(1 + z)^2}\, I\!\left( \frac{\lambda_0}{1 + z} \right) \Delta\lambda_0\, \frac{\Delta t_1}{\Delta t_0}\, \frac{1}{4\pi r_1^2 S^2(t_0)}. $$

At this epoch, because of a scaling down of its frequency by redshift, each
photon has been degraded in energy by the factor $(1 + z)^{-1}$. Thus each
photon now has the energy $ch/\lambda_0$. If we multiply the above expression
by this energy, we get the quantity we were after:

$$ \mathcal{F}(\lambda_0)\, \Delta\lambda_0 = L\, \frac{I(\lambda_0/(1 + z))}{(1 + z)^2}\, \frac{\Delta t_1}{\Delta t_0}\, \frac{1}{4\pi r_1^2 S^2(t_0)}\, \Delta\lambda_0. $$

However, we note from (14.29) that $\Delta t_1/\Delta t_0$ gives us another factor
$(1 + z)$ in the denominator. Thus finally we get

$$ \mathcal{F}(\lambda_0) = \frac{L\, I(\lambda_0/(1 + z))}{(1 + z)^3\, 4\pi r_1^2 S^2(t_0)}. \tag{14.43} $$


Thus the two effects coming in because of the expansion of the
Universe are (1) the reduction by the factor $(1 + z)^{-1}$ of energy emitted
per quantum and (2) the time-dilatation at the receiving end by the factor
$(1 + z)$.

In terms of frequencies the result is quoted as flux density,

$$ \mathcal{S}(\nu_0) = \frac{L\, J(\nu_0 (1 + z))}{(1 + z)\, 4\pi r_1^2 S^2(t_0)}. \tag{14.44} $$

Here $\mathcal{S}(\nu_0)\, \Delta\nu_0$ is the amount of radiation received perpendicular to unit
area in unit time across a frequency range $(\nu_0, \nu_0 + \Delta\nu_0)$.





The optical astronomer uses this result in the form (14.43), while the 
radio astronomer uses it in the form (14.44). The X-ray astronomer uses 
energies instead of frequencies, so (14.44) is scaled by h. Astronomers 
have occasion to use these expressions when looking at the various 
observational tests of cosmology. We will end this section by deriving a 
few results of interest to optical astronomy. 

The expression (14.43) integrated over all wavelengths gives

$$ \mathcal{F}_{\rm bol} = \frac{L_{\rm bol}}{4\pi r_1^2 S^2(t_0) (1 + z)^2}, \tag{14.45} $$

where $L_{\rm bol}$ ($= L$) is the absolute bolometric luminosity of $G_1$. $\mathcal{F}_{\rm bol}$ is
correspondingly the apparent bolometric luminosity of $G_1$. On the
logarithmic scale of magnitudes familiar to the optical astronomer, (14.45)
becomes

$$ m_{\rm bol} = -2.5 \log \left( \frac{\mathcal{F}_{\rm bol}}{\mathcal{F}_0} \right), $$

$$ M_{\rm bol} = -2.5 \log \left( \frac{L_{\rm bol}}{L_\odot} \right) + 4.75, \tag{14.46} $$

$$ m_{\rm bol} - M_{\rm bol} = 5 \log D_1 - 5, $$

where

$$ \mathcal{F}_0 = 2.48 \times 10^{-5}\ \text{erg cm}^{-2}\ \text{s}^{-1}, $$

$$ L_\odot = \text{solar luminosity} = 4 \times 10^{33}\ \text{erg s}^{-1}, \tag{14.47} $$

$$ D_1 = r_1 S(t_0)(1 + z). $$

$D_1$ is called the luminosity distance of $G_1$; in (14.46) it is measured in
parsecs. If we are interested in a magnitude defined for a particular
waveband around $\lambda_0$, say, we may similarly
use (14.43) in the logarithmic form with the apparent magnitude defined
by

$$ m(\lambda_0) = -2.5 \log \mathcal{F}(\lambda_0) + \text{constant}, $$

the constant depending on the filter used to select that waveband. It
is customary to indicate the filter by a suffix attached to $m$. Thus $m_{\rm pg}$
stands for photographic magnitude, $m_{\rm v}$ for visual magnitude, $m_{\rm b}$ for
blue magnitude, and so on.
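
The bolometric relations above translate directly into a few functions. The sketch below assumes the forms $\mathcal{F}_{\rm bol} = L_{\rm bol}/[4\pi r_1^2 S^2(t_0)(1+z)^2]$, $m_{\rm bol} = -2.5\log(\mathcal{F}_{\rm bol}/\mathcal{F}_0)$ and $m_{\rm bol} - M_{\rm bol} = 5\log D_1 - 5$ with $D_1$ in parsecs, and takes the metric distance $r_1 S(t_0)$ as an input in centimetres. The function names are hypothetical; the constants are the rounded values quoted in (14.47).

```python
import math

PC_CM = 3.0856e18        # one parsec in cm
F0 = 2.48e-5             # erg cm^-2 s^-1, flux zero point quoted in (14.47)
L_SUN = 4e33             # erg s^-1, solar luminosity quoted in (14.47)


def luminosity_distance_cm(metric_distance_cm, z):
    """D_1 = r_1 S(t_0) (1 + z), with r_1 S(t_0) supplied in cm."""
    return metric_distance_cm * (1.0 + z)


def bolometric_flux(l_bol, metric_distance_cm, z):
    """F_bol = L_bol / (4 pi D_1^2), i.e. the inverse-square law in D_1."""
    d_lum = luminosity_distance_cm(metric_distance_cm, z)
    return l_bol / (4.0 * math.pi * d_lum**2)


def distance_modulus(metric_distance_cm, z):
    """m_bol - M_bol = 5 log10(D_1 / pc) - 5."""
    d_lum_pc = luminosity_distance_cm(metric_distance_cm, z) / PC_CM
    return 5.0 * math.log10(d_lum_pc) - 5.0


def apparent_bolometric_magnitude(l_bol, metric_distance_cm, z):
    """m_bol = -2.5 log10(F_bol / F0)."""
    return -2.5 * math.log10(bolometric_flux(l_bol, metric_distance_cm, z) / F0)


if __name__ == "__main__":
    d = 1.0e8 * PC_CM                      # a hypothetical galaxy at 100 Mpc
    for z in (0.0, 0.02):
        print(z, bolometric_flux(1e10 * L_SUN, d, z), distance_modulus(d, z))
```

With the rounded constants used here, a source of solar luminosity placed at 10 pc comes out with an apparent bolometric magnitude close to, though not exactly equal to, the 4.75 appearing in (14.46).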

Note, however, that, when using a specific filter, because of redshift
the astronomer has to apply a correction to include the effect of the
term $I(\lambda_0/(1 + z))$. Thus an astronomer using a red filter may actually be
receiving the photons that originated in the blue part of the spectrum
of $G_1$ if $z \approx 1$. This correction, which is crucial to many cosmological
observations, is called the K-correction.


Example 14.5.1 Problem. Calculate the luminosity distance for the de
Sitter universe.

Solution. With the notation used in the text we have

$$ S(t) = e^{Ht}, \qquad S(t_0)/S(t_1) = e^{H(t_0 - t_1)} = 1 + z. $$

The epochs $t_0$ and $t_1$ are related by the result (14.27) with $k = 0$. Thus simple
integration yields

$$ r_1 = \frac{c}{H} \left( e^{-Ht_1} - e^{-Ht_0} \right). $$

Hence the luminosity distance is

$$ D_1 = r_1 S(t_0)(1 + z) = \frac{c}{H} \left( e^{-Ht_1} - e^{-Ht_0} \right) e^{Ht_0} (1 + z) = \frac{c}{H} z (1 + z). $$

14.6 The Olbers paradox 

In 1826, Heinrich Olbers, a physician from Germany, carried out a 
simple calculation which led to a paradoxical answer. His paradox can 
be phrased as this question: ‘Why is the sky dark at night?’ What the 
Olbers calculation shows is that whether we are facing the Sun or not 
makes no difference: the total radiation received from all stars in the 
Universe is infinite. This paradox and its resolution have cosmological 
implications, so it is appropriate to discuss them here. 

Olbers assumed that (1) the Universe is homogeneous, isotropic 
and static, (2) it is infinite in extent and (3) it is filled with radiating 
objects, each with a constant luminosity. In those days the only geometry 
recognized was Euclid’s, so Olbers did his calculation in its framework. 

Take any point in this Universe as the observing post, denoted by the
point O. We wish to calculate the total radiation received at O from all the
stars in the Universe. To this end, divide the Universe into thin concentric
shells centred at O. The volume of a typical shell of radii $R$ and $R + dR$ is

$$ 4\pi R^2\, dR $$

and, if the number density of radiating sources is $N$, the number of such
sources in the shell is

$$ 4\pi R^2 N\, dR. $$

Suppose that each source has luminosity $L$. At a distance $R$ the
source would have a radiation flux of

$$ \frac{L}{4\pi R^2}, $$

so the total contribution of all radiating sources in the shell is to generate
a flux at O of

$$ 4\pi R^2\, dR \times N \times \frac{L}{4\pi R^2} = NL\, dR. $$

The total flux from all shells is therefore

$$ \mathcal{F} = \int_0^\infty LN\, dR = \infty. \tag{14.48} $$

This was the conclusion of Olbers’ calculation. Can its drastic nature 
be moderated? For clearly the night sky is dark and not infinitely bright. 

One possible way out was to note that the typical radiator is of a 
finite size, so that beyond a certain distance from O the foreground 
objects would block the radiation of the background population. (In a 
forest thickly populated with trees we see only the foreground trees.) If 
a typical radiator is a ball of radius a, then from a distance R it subtends 
a solid angle at O of 

7ta 2 

so that the total solid angle subtended by the sources in our shell is 

71 “ x 4nR 2 N dR = 4?r x na 2 NdR. 

R 2 

We integrate this expression to a distance D where it equals the total
solid angle $4\pi$ of the whole sky. Thus, the whole sky will be covered at
a distance

$$D = \frac{1}{\pi a^2 N}. \tag{14.49}$$


If we take our integral only up to this distance, we have a finite answer: 


$$T = NLD = \frac{L}{\pi a^2}. \tag{14.50}$$

However, the expression we have arrived at is four times the surface 
brightness of the typical source. So, if the typical source is like the Sun, 
the sky should be shining like the solar disc! The above calculation 
supposes that the solid angles from different shells do not overlap. A 
more exact calculation can be done taking into account the overlap, but 
our conclusion is not substantially altered. 
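The factor of four can be verified directly from (14.49) and (14.50). The Python sketch below is illustrative only: the function names are hypothetical, the solar radius and luminosity are standard approximate values, and the number density of one star per cubic parsec is an assumption chosen purely to have a number to work with.

```python
import math


def covering_distance(a, N):
    """Distance D of (14.49) at which sources of radius a cover the whole sky."""
    return 1.0 / (math.pi * a**2 * N)


def sky_flux(L, a, N):
    """Flux N * L * D of (14.50), obtained by cutting the integral off at D."""
    return N * L * covering_distance(a, N)


def surface_flux(L, a):
    """Flux leaving the surface of one source: L / (4 pi a^2)."""
    return L / (4.0 * math.pi * a**2)


if __name__ == "__main__":
    parsec = 3.086e18            # cm
    a_sun = 6.96e10              # cm, approximate solar radius
    L_sun = 3.83e33              # erg/s, approximate solar luminosity
    N = 1.0 / parsec**3          # assumed: one Sun-like star per cubic parsec
    print("D (cm):", covering_distance(a_sun, N))
    print("sky flux / surface flux:", sky_flux(L_sun, a_sun, N) / surface_flux(L_sun, a_sun))
```

The ratio does not depend on N, L or a, which is why changing the assumed number density moves the covering distance but not the conclusion.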

To resolve the paradox we can adopt any of the following arguments. 


1. The Universe is of finite extent. 

2. The Universe is of finite age. If the age is T, say, then we can argue that
radiation reaching O today could not have come from distances beyond $c \times T$.

3. Each radiating source lasts only a finite time. This effectively limits the total
light reaching O.

4. The Universe is expanding. This is the most dramatic explanation, for it
invokes the dimming of radiation by the factor $(1 + z)^{-2}$ as found in formula
(14.45). A calculation using this effect usually yields a finite and low value
for sky brightness even for infinitely old models.

We will leave the Olbers paradox here as an example of a simple 
question calling for profound ideas for an answer and turn our attention 
next to actual models of the real Universe. 


Exercises 


1. Taking $\rho_0 = 10^{-31}$ g cm$^{-3}$, calculate the radius of the Einstein universe and
its total mass in spherical space.

2. By calculating the 3-volume of space within the coordinate region r = constant
in the spaces with the spatial line element

$$d\sigma^2 = S^2\left[\frac{dr^2}{1 - kr^2} + r^2(d\theta^2 + \sin^2\theta\,d\phi^2)\right], \qquad k = 0, 1, -1,$$

develop the three-dimensional analogue of the experiment of covering the surfaces
of zero, positive and negative curvature by a plane sheet of paper. (In this
experiment the paper exactly covers a surface of zero curvature; it gets wrinkled
while covering a surface of positive curvature and it gets torn while covering the
surface of negative curvature.)

3. Determine the affine parameter for the radial null geodesic from galaxy $G_1$
to the origin r = 0 in Robertson-Walker spacetime.

4. A particle of mass m is fired today from our galaxy at $t = t_0$ with a linear
momentum $P_0$. Show that the momentum of the particle when it reaches another
galaxy at a later epoch t (as measured in the rest frame of that galaxy) is given
by

$$P = P_0\,\frac{S(t_0)}{S(t)}.$$

Compare this result with the cosmological redshift for photons.

5. Take a galaxy $G_1$ at $(r_1, \theta, \phi)$ as a fundamental observer and write $u^i$ as its
velocity vector in the Robertson-Walker frame. Consider parallel propagation
of this vector along the null ray connecting the galaxy to the observer O at the
origin at the present epoch $t_0$ of observation. Let this vector be $v^i$ at O. This
represents the radial velocity of $G_1$ relative to the cosmological rest frame at O.
Use the Doppler effect to work out the redshift for this motion and show that it
is none other than z as given by the formula (14.29). You can do this exercise for
the Schwarzschild line element and you can show that the gravitational redshift
can also be understood as a Doppler effect for the parallel-transported velocity
vector of the source along the null geodesic to the observer. Try to generalize
these results.

6. In a universe with $S(t) \propto t^{2/3}$ and k = 0, a galaxy is observed to have a
redshift z = 1.25. How long has light taken to travel from that galaxy to us?
Express your answer in units of $\tau_0$.

7. Work out the formula (14.45) for the universe with $S \propto t^{2/3}$ and k = 0, and
compare this with the result for the de Sitter universe. In which model is the
galaxy apparently brighter?

8. Why is the ‘expanding-universe’ solution preferable as a solution to the 
Olbers paradox, rather than ‘a finite universe’ or ‘a finitely old’ universe? 



Chapter 15 

Friedmann models 


15.1 Introduction 

The work covered in Chapter 14 did not tell us two important items of
information about the Universe: (1) the rate at which it expands as given
by the function S(t); and (2) whether its spatial sections t = constant
are open or closed as indicated by the parameter k. To find answers to
these questions, it is necessary to go beyond the Weyl postulate and the
cosmological principle. We require a dynamical theory that tells us how
the scale factor and curvature are determined by the matter/radiation
contents of the Universe.

A comparison of Newton’s law of gravitation with the general theory
of relativity shows the latter as enjoying advantages both on the
theoretical and on the observational front. General relativity gets round 
the criticism of Newtonian gravity of violating the light-speed limit. It 
allows for the permanence of gravitation by identifying its effect with 
the curvature of spacetime. Observationally it performs better vis-a-vis 
the Solar-System tests and explains the shrinking of binaries through 
gravitational radiation. It therefore generates greater confidence than 
Newton’s approach does, especially for use in cosmology, where strong 
gravitational fields are likely to be involved and where distances are so 
large that the assumption of instantaneous action at a distance would 
be misleading. Hence we will adopt general relativity as the underlying 
theory for constructing models of the Universe. 

We will now undertake that exercise by constructing the models
which Friedmann [49] in 1922-4 and Lemaître [50] in 1927 came up
with before Hubble’s results became known.
15.2 Setting up the field equations

We begin with the Robertson-Walker line element: 

$$ds^2 = c^2\,dt^2 - S^2(t)\left[\frac{dr^2}{1 - kr^2} + r^2(d\theta^2 + \sin^2\theta\,d\phi^2)\right]. \tag{15.1}$$

We use it first to compute the Einstein tensor and thereby formulate the 
general-relativistic field equations. To solve them we will next require 
the energy tensor of the material contents of the Universe. 

Accordingly, we set 

$$x^0 = ct, \quad x^1 = r, \quad x^2 = \theta, \quad x^3 = \phi \tag{15.2}$$

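so that the non-zero components of $g_{ik}$ and $g^{ik}$ are

$$g_{00} = 1, \quad g_{11} = -\frac{S^2}{1 - kr^2}, \quad g_{22} = -S^2 r^2, \quad g_{33} = -S^2 r^2 \sin^2\theta,$$

$$g^{00} = 1, \quad g^{11} = -\frac{1 - kr^2}{S^2}, \quad g^{22} = -\frac{1}{S^2 r^2}, \quad g^{33} = -\frac{1}{S^2 r^2 \sin^2\theta},$$

$$\sqrt{-g} = \frac{S^3 r^2 \sin\theta}{\sqrt{1 - kr^2}}.$$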


The non-zero components of $\Gamma^i_{kl}$ are then as follows:
From these we get

$$R = \frac{6}{c^2}\left(\frac{\ddot S}{S} + \frac{\dot S^2 + kc^2}{S^2}\right) \tag{15.7}$$

and hence

$$G^1_1 \equiv R^1_1 - \tfrac{1}{2}R = -\frac{1}{c^2}\left(2\frac{\ddot S}{S} + \frac{\dot S^2 + kc^2}{S^2}\right) = G^2_2 = G^3_3, \tag{15.8}$$

$$G^0_0 \equiv R^0_0 - \tfrac{1}{2}R = -\frac{3}{c^2}\left(\frac{\dot S^2 + kc^2}{S^2}\right). \tag{15.9}$$

We have gone through the details of the calculation to illustrate how
techniques of general relativity developed in earlier chapters can be
applied to the problem of cosmology. The reader may check that putting
$S = \text{constant} = S_0$ and $k = +1$ gives us the formulae (14.3) and (14.4)
obtained for the Einstein universe in Chapter 14. As a general comment
we remark that, because we have spatial homogeneity, the tensor
components above (Equations (15.5)-(15.9)) do not contain any space
coordinates. Further, because of isotropy, we have the three space-space
components of the Einstein tensor equal. Recalling now the Einstein
equations, we get from (15.8) and (15.9) the only non-trivial equations
of the set as


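$$2\frac{\ddot S}{S} + \frac{\dot S^2 + kc^2}{S^2} = \frac{8\pi G}{c^2}T^1_1 = \frac{8\pi G}{c^2}T^2_2 = \frac{8\pi G}{c^2}T^3_3, \tag{15.10}$$

$$\frac{\dot S^2 + kc^2}{S^2} = \frac{8\pi G}{3c^2}T^0_0. \tag{15.11}$$

We next consider the energy tensor.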


15.3 Energy tensors of the Universe 

Before we consider specific forms of $T^i_k$, it is worth noting that two properties
must be satisfied by any energy tensor in the present framework
of cosmology. The first is obvious from (15.10):

$$T^1_1 = T^2_2 = T^3_3 = -p \tag{15.12}$$

(say). The fact that these three components of $T^i_k$ are equal is hardly
surprising since we have already emphasized the condition of isotropy
imposed on the Universe. In the light of our discussion of Chapter 7,
we identify the quantity p with pressure. We further define the energy
density by

$$T^0_0 = \epsilon. \tag{15.13}$$

The second property is not quite so obvious, but is derivable from 
(15.10) and (15.11). It relates the pressure to the energy density. We 
note that if we differentiate (15.11) with respect to t we can express the
resulting answer as a linear combination of (15.10) and (15.11). The
result is in fact equivalent to the following identity:

$$\frac{d}{dt}\left[S(\dot S^2 + kc^2)\right] = \dot S\left[2S\ddot S + \dot S^2 + kc^2\right],$$

that is,

$$\frac{d}{dS}(\epsilon S^3) + 3pS^2 = 0. \tag{15.14}$$

It is not necessary, however, to write down the full field equations (15.10) 
and (15.11) in order to arrive at (15.14). The above result is a direct 
consequence of the conservation law implicit in the Einstein equations: 

$$T^{ik}{}_{;k} = 0. \tag{15.15}$$

Recall that, from the Bianchi identities, (15.15) follows identically. We 
now turn our attention to the specific forms of the energy tensor. 

15.3.1 Pressure and random motion of galaxies 

We have assumed via the Weyl postulate that the primary unit of the
Universe is a galaxy, which may be treated as a point particle. The
characteristic length scale of the Universe, now taken as $c/H_0$, where
$H_0$ is the Hubble constant at present, works out at $\sim 10^{28}$ cm, compared
with which the galactic length scale is $\sim 30$ kpc $\sim 10^{23}$ cm. Thus the
galaxy in the Universe is like a bead of diameter 1 cm in a field of size
1 km. We may ideally visualize the ‘cosmological fluid’ as made of these
beads, flowing smoothly with negligible pressure.

The galaxies ideally should follow the Weyl postulate: in reality
they do so at best approximately. Random motions of galaxies in clusters,
typically of the order of $v \sim 300$ km s$^{-1}$, provide pressure to the
cosmological fluid of the order of $p \sim \rho v^2$, $\rho$ being the density of the fluid.
That is, the ratio $p/\rho$ (measured in units of $c^2$) is as low as $10^{-6}$. So when
we write the energy tensor as

$$T^{ik} = (p + \rho)u^i u^k - pg^{ik} \tag{15.16}$$

we may justifiably approximate $u^i$ by $[1, u^\mu]$, and ignore $|u^\mu|$ in comparison
with unity. But when is this approximation valid?


Example 15.3.1 Problem. How does random motion behave in an expanding
Universe?

Solution. Let us take the velocity vector of a typical galaxy as $[1, u^\mu]$, where
$|u^\mu| \ll 1$. Since the galaxy follows a geodesic, we get

$$\frac{du^i}{ds} + \Gamma^i_{kl}\frac{dx^k}{ds}\frac{dx^l}{ds} = 0.$$

Here we retain only the first-order terms. Using the Christoffel symbols of
Section 15.2 we get, for $\mu = 1$, say

$$\frac{du^1}{ds} + 2\Gamma^1_{01}u^1 = 0,$$

i.e.,

$$\frac{du^1}{dt} + 2\frac{\dot S}{S}u^1 = 0 \;\Rightarrow\; u^1 S^2 = \text{constant}.$$

The same applies to $\mu = 2, 3$. Since we are using comoving coordinates, the
physical motion is $v^\mu = S u^\mu$. So we get $v^\mu S$ = constant. Thus, as the Universe
expands, the random motion decreases as $S^{-1}$.
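The $S^{-1}$ decay can be confirmed by integrating the first-order equation of the example numerically. The Python sketch below assumes, purely for illustration, the scale factor $S(t) = t^{2/3}$ in arbitrary units (this choice, the function names and the starting value of the coordinate velocity are all assumptions made here). It integrates $du/dt = -2(\dot S/S)u$ with a fixed-step Runge-Kutta scheme and checks that the product of the physical velocity $v = Su$ and the scale factor stays constant.

```python
def hubble_rate(t):
    """S'/S for the assumed scale factor S(t) = t**(2/3)."""
    return 2.0 / (3.0 * t)


def scale_factor(t):
    """Assumed scale factor, in arbitrary units."""
    return t ** (2.0 / 3.0)


def evolve_coordinate_velocity(u_start, t_start, t_end, steps=10000):
    """Integrate du/dt = -2 (S'/S) u from t_start to t_end with fixed-step RK4."""
    def rhs(t, u):
        return -2.0 * hubble_rate(t) * u

    h = (t_end - t_start) / steps
    t, u = t_start, u_start
    for _ in range(steps):
        k1 = rhs(t, u)
        k2 = rhs(t + 0.5 * h, u + 0.5 * h * k1)
        k3 = rhs(t + 0.5 * h, u + 0.5 * h * k2)
        k4 = rhs(t + h, u + h * k3)
        u += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
    return u


def physical_velocity(u, t):
    """Physical random velocity v = S u in comoving coordinates."""
    return scale_factor(t) * u


if __name__ == "__main__":
    u0, t0, t1 = 1.0e-3, 1.0, 8.0
    u1 = evolve_coordinate_velocity(u0, t0, t1)
    print("v * S at start:", physical_velocity(u0, t0) * scale_factor(t0))
    print("v * S at end:  ", physical_velocity(u1, t1) * scale_factor(t1))
```

Any other smooth, increasing scale factor could be substituted for the assumed power law; the invariant $vS$ does not depend on that choice.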


The solved example above shows that, in the Robertson-Walker
spacetime, the random velocity $v^\mu$ varies as $1/S$. Hence, in an expanding
Universe, the pressure was more important in the past, is not so
important now and will be even less important in the future. As we
turn towards the past epoch, we should find the galaxy motions becoming
more and more turbulent, since v was larger in the past. Thus, if
we use $S \sim 10^{-3}S_0$ ($S_0$ being the value of S at the present epoch),
the p-term would no longer be negligible in this epoch and prior
to it.

For such past epochs we have to abandon our simplified picture of 
cosmology and ask whether galaxies existed as single units then. This 
question leads us to cosmogony, the subject of the origin of the large- 
scale structure of the Universe. Obviously, galaxies were formed at some 
stage in the past and, in a proper theory of cosmology and cosmogony, 
we have to say how and when they were formed. This topic, however, 
does not fall within the ambit of this text. The reader is referred to [53], 
which is the companion text to this one. 

Returning to our discussion of energy tensors, we see that, if we
simply extrapolate $v \propto S^{-1}$ to very low values of S, v becomes comparable
to c and our approximation that led us to $v \propto S^{-1}$ breaks down.
The correct formula then tells us that the 3-momentum P goes as $S^{-1}$.
In this relativistic domain galaxies have not yet formed and matter is in
the form of atomic particles moving very rapidly. Thus we have to use
the formula (7.22), and we set

$$p = \tfrac{1}{3}\epsilon, \tag{15.17}$$

where e denotes the energy density of these fast-moving particles. Thus, 
we may look upon a typical volume of these early epochs as containing 
matter particles moving at random relativistically, but any such spherical 
volume would have a centre of mass of all these particles at rest in the 
Robertson-Walker frame. In this case the Weyl postulate is not satisfied 
for a typical particle, but it may still be applied to the centre of mass of
a typical spherical volume. 


15.3.2 Matter versus radiation domination 

So, if S continues to increase from very small values, then (15.17) would
hold for the early epochs, just as (15.16), with $p \approx 0$, holds in the present
and relatively recent epochs. The transition between the two epochs was
through a rather messy phase when neither (15.16) nor (15.17) applied.
If (15.16) holds, then from (15.14) we get, with p = 0,

$$\frac{d}{dS}(\rho S^3) = 0, \tag{15.18}$$

which integrates to

$$\rho = \rho_0\frac{S_0^3}{S^3}, \tag{15.19}$$

$\rho_0$ and $S_0$ being the values of $\rho$ and S in the present epoch.

Similarly, substitution of (15.17) into (15.14) leads to

$$\frac{d}{dS}(\epsilon S^4) = 0, \tag{15.20}$$

giving

$$\epsilon \propto S^{-4}. \tag{15.21}$$

We therefore have the following picture. For a distribution of matter
(15.21) was applicable when S was very small compared with $S_0$, while
(15.19) holds in the more recent epochs. If, however, on top of matter
we also have electromagnetic radiation present in the Universe, it too
will contribute to $T^i_k$. For small S, (15.21) holds uniformly for matter
(moving relativistically) and for radiation. However, as S increases we
have to be more careful in distinguishing between the contributions of
matter and radiation to $T^i_k$. For, as we shall see later, while matter and
radiation were in close interaction at small S, at later epochs they became
effectively decoupled from each other. We will go into these details more
fully in Chapter 16.

For the present discussion let us assume that, after a certain epoch
$t = t_{\rm dec}$ when S was given by $S = S_{\rm dec}$, radiation and matter decoupled
from each other, each going its own way. Thus we can write

$$T^i_k = T^i_k\big|_{\rm matter} + T^i_k\big|_{\rm radiation} \tag{15.22}$$

and assume that the divergence of each energy tensor separately vanishes.
Since for the radiation energy tensor we have (for $\mu = 1, 2, 3$), say,

$$T^\mu_\mu\big|_{\rm radiation} = -\tfrac{1}{3}T^0_0\big|_{\rm radiation} \quad \text{(no sum over } \mu\text{)}, \tag{15.23}$$
we get for $S > S_{\rm dec}$

$$\epsilon = \epsilon_0\frac{S_0^4}{S^4}. \tag{15.24}$$


What is $t_{\rm dec}$? Why, if at all, should matter decouple from radiation?
What happened prior to $t = t_{\rm dec}$? We defer a discussion of these questions
to Chapter 16. There was, however, another important epoch in the past
history of the Universe, when the densities of matter and radiation were
equal. We will denote it by $t = t_{\rm eq}$, when S was equal to $S_{\rm eq}$, say. It is
easy to estimate this scale as follows.

The present estimates of $\epsilon_0 \approx 4 \times 10^{-13}$ erg cm$^{-3}$ and of $\rho_0 c^2 >
3 \times 10^{-10}$ erg cm$^{-3}$ mean that the matter density is about $10^3$ times
the radiation density. Thus $\epsilon_0 \ll \rho_0 c^2$, and we may ignore the contribution
of radiation (in comparison with the contribution of matter)
to the field equations (15.10) and (15.11) at the present epoch, and for
$S > S_0$. However, for the past epochs with $S < S_0$, we have from (15.19)
and (15.21)

$$\frac{\epsilon}{\rho c^2} = \frac{\epsilon_0}{\rho_0 c^2}\,\frac{S_0}{S}, \tag{15.25}$$


and we cannot ignore the contribution of radiation for, say, $S_0/S \sim 10^3$.
This is the epoch $t_{\rm eq}$. Indeed, prior to this epoch, that is for $S < S_{\rm eq}$, the
relative importance of radiation and matter was inverted: radiation was
the more dominant factor in deciding how S should vary with t.
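The estimate of the equality epoch is simple arithmetic on (15.25), as the following Python sketch shows. The two densities are the values quoted in the text; the constant and function names are hypothetical labels chosen here.

```python
EPS0_RADIATION = 4.0e-13   # erg cm^-3, present radiation energy density quoted in the text
RHO0_C2_MATTER = 3.0e-10   # erg cm^-3, present matter energy density (lower bound quoted in the text)


def radiation_to_matter_ratio(S_over_S0):
    """Ratio eps / (rho c^2) from (15.25) at scale factor S, given S / S0."""
    return (EPS0_RADIATION / RHO0_C2_MATTER) / S_over_S0


def equality_scale():
    """Value of S_eq / S0 at which the ratio in (15.25) equals unity."""
    return EPS0_RADIATION / RHO0_C2_MATTER


if __name__ == "__main__":
    s_eq = equality_scale()
    print("S_eq / S0 =", s_eq)
    print("ratio at S_eq:", radiation_to_matter_ratio(s_eq))
    print("ratio today:", radiation_to_matter_ratio(1.0))
```

With these inputs the equality scale comes out slightly above $10^{-3}S_0$; since the matter figure is a lower bound, a larger matter density would push the equality epoch to a smaller scale factor.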

From the above discussion we see that, at $S = S_{\rm eq} \approx 10^{-3}S_0$, we
have a transition from a radiation-dominated Universe to a matter-dominated
one. Here we will limit ourselves to the matter-dominated
models with negligible pressure, leaving the discussion of the radiation-dominated
models to the following chapter. Equations (15.10) and
(15.11) are therefore to be solved with

$$T^1_1 = T^2_2 = T^3_3 = 0, \qquad T^0_0 = \rho_0 c^2\frac{S_0^3}{S^3}. \tag{15.26}$$

This simplification leads us to the classic models first considered by A.
Friedmann in 1922. Basically, these models ignore any contributions
of electromagnetic radiation to $T^i_k$ and suppose that the matter in the
Universe can be approximated by dust.

We also mention, in passing, that the above analysis ignores the
contribution of dark matter. A realistic assessment of dark matter will
push up the value of $\rho_0$ by a factor $\sim 6$-7.
15.4 The Friedmann models

We will assume that the Universe is (as at present) dust-dominated. For 
dust models, Equations (15.10) and (15.11) become 


$$2\frac{\ddot S}{S} + \frac{\dot S^2 + kc^2}{S^2} = 0, \tag{15.27}$$

$$\frac{\dot S^2 + kc^2}{S^2} = \frac{8\pi G\rho_0}{3}\,\frac{S_0^3}{S^3}. \tag{15.28}$$

In view of the conservation law given in (15.14), the above two
differential equations are not independent, and only one of them is sufficient
to determine S(t). Since it is of lower order, we will choose (15.28)
for our solution, and consider the three cases $k = 0, 1, -1$ separately.


15.4.1 Euclidean sections (k = 0) 

This is the simplest case and is also known as the Einstein-de Sitter 
model, since it was given by Einstein and de Sitter in a joint paper [54] 
in 1932. Equation (15.28) becomes 


$$\dot S^2 = \frac{8\pi G\rho_0}{3}\,\frac{S_0^3}{S}. \tag{15.29}$$


We now recall from Chapter 14 that the present value of Hubble’s constant
is given by

$$\left.\frac{\dot S}{S}\right|_{t_0} = H_0. \tag{15.30}$$


Hence, on applying (15.29) to the present epoch, we get

$$\rho_0 = \frac{3H_0^2}{8\pi G} \equiv \rho_c. \tag{15.31}$$

For reasons that will become clear later, $\rho_c$ is often called the closure
density. With the range of values of $H_0$ quoted in Chapter 14, we have

$$\rho_c = 2 \times 10^{-29}\,h_0^2 \text{ g cm}^{-3}. \tag{15.32}$$


The value as estimated by using the current favourite value of $h_0 \approx 0.7$ is
considerably higher than the matter density actually observed at present. 
We will return to this issue in Chapter 16. 
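The numerical coefficient in (15.32) follows from (15.31) once units are fixed. The Python sketch below assumes the usual parametrization $H_0 = 100\,h_0$ km s$^{-1}$ Mpc$^{-1}$ (taken here as an assumption about the Chapter 14 convention) and uses standard CGS values for G and the megaparsec; the names are hypothetical.

```python
import math

G_CGS = 6.674e-8          # cm^3 g^-1 s^-2, Newtonian gravitational constant
MPC_IN_CM = 3.0857e24     # centimetres in one megaparsec
KM_IN_CM = 1.0e5          # centimetres in one kilometre


def hubble_constant_cgs(h0):
    """H0 in s^-1, assuming H0 = 100 * h0 km/s/Mpc."""
    return 100.0 * h0 * KM_IN_CM / MPC_IN_CM


def closure_density(h0):
    """Closure density rho_c = 3 H0^2 / (8 pi G) of (15.31), in g cm^-3."""
    H0 = hubble_constant_cgs(h0)
    return 3.0 * H0**2 / (8.0 * math.pi * G_CGS)


if __name__ == "__main__":
    print("rho_c for h0 = 1.0:", closure_density(1.0), "g cm^-3")
    print("rho_c for h0 = 0.7:", closure_density(0.7), "g cm^-3")
```

The result scales as $h_0^2$, so the value for $h_0 = 0.7$ is about half the value for $h_0 = 1$.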

Returning to (15.29), it is easy to verify that it has the solution

$$S = S_0\left(\frac{3H_0 t}{2}\right)^{2/3}. \tag{15.33}$$

Fig. 15.1. The scale factor of the Einstein-de Sitter model ($k = 0$ in the case of the
simplest Friedmann models).


An arbitrary constant that arises from the integration of the differential
equation can be set equal to zero by assuming that S = 0 at t = 0. We
also get from Equation (15.30) the age of the Universe as the present
value of t:

$$t_0 = \frac{2}{3H_0}. \tag{15.34}$$


The constant $S_0$, the value of the scale factor at the present epoch, is not
determined. It has the dimensions of length, and it can be absorbed into
the unit of length chosen. Figure 15.1 illustrates this solution.
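The Einstein-de Sitter solution can be checked against (15.29) without any integration. In the Python sketch below (function names and the choice of units with $H_0 = 1$ and $S_0 = 1$ are assumptions made here), the closure condition $8\pi G\rho_0/3 = H_0^2$ from (15.31) is used to write (15.29) as $\dot S^2 = H_0^2 S_0^3/S$, and the residual of that equation is evaluated by finite differences for the power-law solution $S = S_0(3H_0t/2)^{2/3}$.

```python
def eds_scale_factor(t, H0=1.0, S0=1.0):
    """Einstein-de Sitter scale factor S = S0 * (3 H0 t / 2)**(2/3)."""
    return S0 * (1.5 * H0 * t) ** (2.0 / 3.0)


def eds_age(H0=1.0):
    """Present age t0 = 2 / (3 H0), the epoch at which S = S0."""
    return 2.0 / (3.0 * H0)


def friedmann_residual(t, H0=1.0, S0=1.0, h=1.0e-6):
    """Residual of (15.29) with rho_0 = rho_c: Sdot^2 - H0^2 S0^3 / S."""
    S = eds_scale_factor(t, H0, S0)
    S_dot = (eds_scale_factor(t + h, H0, S0) - eds_scale_factor(t - h, H0, S0)) / (2.0 * h)
    return S_dot**2 - H0**2 * S0**3 / S


if __name__ == "__main__":
    t0 = eds_age()
    print("age in units of 1/H0:", t0)
    print("S(t0)/S0:", eds_scale_factor(t0))
    print("residual of (15.29) at t = 0.5:", friedmann_residual(0.5))
```

The age is two-thirds of the Hubble time $1/H_0$, which is the benchmark against which the closed and open models are compared later in the section.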


15.4.2 Closed sections (k = 1)

Equations (15.10) and (15.11) now take the form

$$2\frac{\ddot S}{S} + \frac{\dot S^2 + c^2}{S^2} = 0, \tag{15.35}$$

$$\frac{\dot S^2 + c^2}{S^2} - \frac{8\pi G\rho_0 S_0^3}{3S^3} = 0. \tag{15.36}$$

It is convenient to introduce the quantities q(t) and H(t) through the
relations

$$\frac{\ddot S}{S} = -q(t)[H(t)]^2, \qquad H(t) = \frac{\dot S}{S}, \tag{15.37}$$

with their present values denoted by $q_0$ and $H_0$. We have already come
across $H_0$, the Hubble constant. The second parameter $q_0$ is called the
deceleration parameter, and it is useful for expressing $\rho_0$ in terms of the
closure density.
With the above definitions, (15.35) and (15.36) take the following
forms when applied at the present epoch:

$$\frac{c^2}{S_0^2} = (2q_0 - 1)H_0^2, \tag{15.38}$$

$$\rho_0 = \frac{3H_0^2}{4\pi G}\,q_0. \tag{15.39}$$

The density $\rho_0$ is often expressed in the following form:

$$\rho_0 = \rho_c\Omega_0, \tag{15.40}$$


so that from (15.38), (15.39) and (15.40) we get the density parameter

$$\Omega_0 = 2q_0. \tag{15.41}$$

Since the left-hand side of (15.38) is positive, we must have

$$q_0 > \tfrac{1}{2}, \qquad \Omega_0 > 1. \tag{15.42}$$

Thus our closed model has density exceeding the so-called closure
density $\rho_c$. This explains the name ‘closure density’. It is the value of
the universal density that must be exceeded if the model is to describe
a closed universe. We mention at this stage the result (to be proved
shortly) that for the open models ($k = -1$) the inequalities of (15.42)
are reversed.

Using (15.38) and (15.39) to eliminate $S_0$ and $\rho_0$, we get the following
differential equation:

$$\dot S^2 = c^2\,\frac{\alpha - S}{S}, \tag{15.43}$$

with $\alpha$ given by

$$\alpha = \frac{2q_0}{(2q_0 - 1)^{3/2}}\,\frac{c}{H_0}. \tag{15.44}$$


The parameter $\alpha$ has the dimensions of length. Thus the model is characterized
by the parameters $H_0$ and $q_0$ (or, alternatively, $\Omega_0$).
Equation (15.43) can be integrated as follows. We get


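$$ct = \int\frac{\sqrt{S}\,dS}{\sqrt{\alpha - S}}.$$

Make the substitution in terms of an auxiliary variable $\Theta$:

$$S = \tfrac{1}{2}\alpha(1 - \cos\Theta). \tag{15.45}$$

Then the integral becomes

$$ct = \tfrac{1}{2}\alpha(\Theta - \sin\Theta). \tag{15.46}$$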





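Again, as in the case k = 0 we have taken S = 0 at t = 0 ($\Theta = 0$). We
therefore get $t = t_0$ by requiring that $S = S_0$. From (15.38) and (15.44)
we see that $S = S_0$ at $\Theta = \Theta_0$, where

$$S_0 = \tfrac{1}{2}\alpha(1 - \cos\Theta_0) = \frac{c}{H_0}(2q_0 - 1)^{-1/2} = \frac{2q_0 - 1}{2q_0}\,\alpha,$$

that is,

$$\cos\Theta_0 = \frac{1 - q_0}{q_0}, \qquad \sin\Theta_0 = \frac{\sqrt{2q_0 - 1}}{q_0}. \tag{15.47}$$

We therefore get the age of the Universe as

$$t_0 = \frac{\alpha}{2c}(\Theta_0 - \sin\Theta_0) = \frac{q_0}{(2q_0 - 1)^{3/2}}\left[\cos^{-1}\left(\frac{1 - q_0}{q_0}\right) - \frac{\sqrt{2q_0 - 1}}{q_0}\right]\frac{1}{H_0}. \tag{15.48}$$

For example, for $q_0 = 1$ we get

$$t_0 = \left(\frac{\pi}{2} - 1\right)\frac{1}{H_0}. \tag{15.49}$$

Note that S reaches a maximum value at $\Theta = \pi$, when

$$S = S_{\max} = \alpha = \frac{2q_0}{(2q_0 - 1)^{3/2}}\,\frac{c}{H_0}. \tag{15.50}$$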


Thus, for $q_0 = 1$, the Universe expands to twice its present size.

In closed models, therefore, expansion is followed by contraction
and S decreases to zero. The value S = 0 is reached when $\Theta = 2\pi$; that
is, when

$$t_L = \frac{\pi\alpha}{c} = \frac{2\pi q_0}{(2q_0 - 1)^{3/2}}\,\frac{1}{H_0}. \tag{15.51}$$


The quantity $t_L$ may be termed the lifespan of this universe. For $q_0 = 1$,
$t_L = 2\pi H_0^{-1} = 2\pi\tau_0$. Recall that $\tau_0$ is defined by the relation (14.40).

Figure 15.2 illustrates the function S(t) for the closed models for a
number of parameter values $q_0$. All curves have been adjusted to have
the same value of $H_0$ at point P. Notice that the value S = 0 is reached
sooner in the past as $q_0$ is increased from just over 1/2.
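The closed-model formulae (15.47) to (15.51) are easy to tabulate. The Python sketch below works in units of $1/H_0$ for times and of $S_0$ for lengths; the function names are hypothetical, and the ratio $S_{\max}/S_0 = 2q_0/(2q_0 - 1)$ is obtained here by dividing (15.50) by the value of $S_0$ implied by (15.38).

```python
import math


def theta_0(q0):
    """Present value of the development angle, from cos(Theta_0) = (1 - q0) / q0."""
    return math.acos((1.0 - q0) / q0)


def closed_model_age(q0, H0=1.0):
    """Age of the closed model, equation (15.48); requires q0 > 1/2."""
    if q0 <= 0.5:
        raise ValueError("closed models require q0 > 1/2")
    prefactor = q0 / ((2.0 * q0 - 1.0) ** 1.5 * H0)
    th0 = theta_0(q0)
    return prefactor * (th0 - math.sin(th0))


def maximum_expansion(q0):
    """S_max / S_0 = 2 q0 / (2 q0 - 1), from (15.50) and (15.38)."""
    return 2.0 * q0 / (2.0 * q0 - 1.0)


def lifespan(q0, H0=1.0):
    """Total lifespan t_L of equation (15.51)."""
    return 2.0 * math.pi * q0 / ((2.0 * q0 - 1.0) ** 1.5 * H0)


if __name__ == "__main__":
    for q0 in (0.6, 1.0, 2.0):
        print(q0, closed_model_age(q0), maximum_expansion(q0), lifespan(q0))
```

As $q_0$ approaches 1/2 from above the age tends to the Einstein-de Sitter value of 2/3 in these units, while the maximum size and the lifespan grow without bound.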


15.4.3 Open sections (k = -1)

Equations (15.10) and (15.11) become in this case

$$2\frac{\ddot S}{S} + \frac{\dot S^2 - c^2}{S^2} = 0, \tag{15.52}$$

$$\frac{\dot S^2 - c^2}{S^2} - \frac{8\pi G\rho_0 S_0^3}{3S^3} = 0. \tag{15.53}$$
Fig. 15.2. The scale factors for closed ($k = +1$) Friedmann models are shown in the
diagram. The life span of the Universe gets smaller for larger values of the
deceleration parameter $q_0$.



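We again use the definitions of (15.37) and apply them at the present
epoch to get

$$\frac{c^2}{S_0^2} = (1 - 2q_0)H_0^2, \tag{15.54}$$

$$\rho_0 = \frac{3H_0^2}{4\pi G}\,q_0, \qquad \Omega_0 = 2q_0. \tag{15.55}$$

Thus instead of (15.42) we now have

$$0 < q_0 < \tfrac{1}{2}, \qquad 0 < \Omega_0 < 1, \tag{15.56}$$

and in place of (15.43) we get

$$\dot S^2 = c^2\,\frac{\beta + S}{S}, \tag{15.57}$$

with

$$\beta = \frac{2q_0}{(1 - 2q_0)^{3/2}}\,\frac{c}{H_0}. \tag{15.58}$$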


As in the $k = +1$ case, the solution of (15.57) may be expressed by a
parameter $\Psi$ with

$$S = \tfrac{1}{2}\beta(\cosh\Psi - 1), \qquad ct = \tfrac{1}{2}\beta(\sinh\Psi - \Psi). \tag{15.59}$$

The present value of $\Psi$ is given by $\Psi_0$, where

$$\cosh\Psi_0 = \frac{1 - q_0}{q_0}, \qquad \sinh\Psi_0 = \frac{\sqrt{1 - 2q_0}}{q_0}. \tag{15.60}$$

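We have set t = 0 at S = 0, as in the two preceding cases. The present
value of t is given by

$$t_0 = \frac{q_0}{(1 - 2q_0)^{3/2}}\left[\frac{\sqrt{1 - 2q_0}}{q_0} - \ln\left(\frac{1 - q_0 + \sqrt{1 - 2q_0}}{q_0}\right)\right]\frac{1}{H_0}. \tag{15.61}$$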


Like the Einstein-de Sitter model, these models continue to 
expand forever. The behaviour of S(t) in these models is illustrated in 
Figure 15.3. 
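The age of the open models follows from (15.59) and (15.60) as $t_0 = (\beta/2c)(\sinh\Psi_0 - \Psi_0)$. A short Python sketch, in units of $1/H_0$ and with a hypothetical function name, makes the dependence on $q_0$ concrete and checks the two limiting cases.

```python
import math


def open_model_age(q0, H0=1.0):
    """Age t0 = (beta / 2c) (sinh Psi_0 - Psi_0) for 0 < q0 < 1/2, in units of 1/H0."""
    if not 0.0 < q0 < 0.5:
        raise ValueError("open dust models require 0 < q0 < 1/2")
    psi0 = math.acosh((1.0 - q0) / q0)          # from cosh(Psi_0) = (1 - q0) / q0
    prefactor = q0 / ((1.0 - 2.0 * q0) ** 1.5 * H0)
    return prefactor * (math.sinh(psi0) - psi0)


if __name__ == "__main__":
    for q0 in (0.01, 0.1, 0.2, 0.4):
        print(q0, open_model_age(q0))
```

The age rises towards $1/H_0$ as $q_0$ tends to zero and falls towards the Einstein-de Sitter value of $2/(3H_0)$ as $q_0$ tends to 1/2, so for a given Hubble constant the open models are older than the flat and closed ones.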

15.4.4 The Milne model 

It is worth pointing out that the model with $k = -1$, $q_0 = 0$, $S(t) = ct$
represents flat spacetime. In fact, by the following coordinate transformation
we can change the line element to a manifestly Minkowski
form:

$$R = ctr, \qquad T = t\sqrt{1 + r^2}.$$
This model arose naturally in Milne’s kinematic relativity [55], which 
was a cosmological theory with foundations different from those of 
general relativity. For this reason the above model is sometimes referred 
to as the Milne model. 

For a comparison, the three types of Friedmann models ($k = 0, \pm 1$)
are shown together on the same plot in Figure 15.4. There is a unique


Fig. 15.3. Three cases of the temporal behaviour of the scale factor S(t) for the open
Friedmann model corresponding to $q_0$ = 0, 0.1 and 0.2. The age of the Universe is
larger for smaller $q_0$. As shown in the figure, the largest age is $1/H_0$,
corresponding to $q_0 = 0$.
Fig. 15.4. All three types of Friedmann models shown together. All share the
property of a singular origin, as discussed in the text.

‘flat’ model (the k = 0 case is often so described; but this can be misleading
insofar as the ‘flatness’ refers to the spatial sections t = constant,
not to spacetime as a whole), but a continuous range of $k = \pm 1$ models,
of which two representative ones are shown. The dots P, Q, R lie on a
typical curve H(t) = constant. Thus, for the same Hubble constant, the
open models give a larger age.

Figure 15.4 shows how all the Friedmann models have the common
feature of having S = 0 at a certain epoch (which we designate by
t = 0). As we approach S = 0, the Hubble constant increases rapidly,
becoming infinite at S = 0, except in the special case of the Milne model
$k = -1$, $q_0 = 0$. This epoch therefore indicates violent activity and is
given the name big bang. It was Fred Hoyle who in the late 1940s gave 
this name, largely in a sarcastic vein, as he was, and continued to be, 
critical of the big-bang concept. We will discuss the reasons for this in 
extenso in later chapters. For the time being we simply state that the 
name has stuck and has been accepted by a large majority of workers in 
cosmology. We also mention that the big bang is the singular state where 
all the geodesics of the Weyl postulate meet. 


15.4.5 Luminosity distance 

A practical result we need is the luminosity distance described in Chapter 
14, for it tells us the effective distance of a source at a given redshift, 
the distance whose square we divide by in order to estimate the flux of 
radiation received from the source normal to a unit area at our location. 
If the light left the source at time t\ to reach us at time to, its travel 
formula tells us that, for the $k = +1$ model,

$$\int_0^{r_1}\frac{dr}{\sqrt{1 - r^2}} = \int_{t_1}^{t_0}\frac{c\,dt}{S(t)}, \tag{15.63}$$


where the source galaxy is located at $r = r_1$. Using formulae (15.45)
and (15.46) we get

$$c\,dt = \tfrac{1}{2}\alpha[1 - \cos\Theta]\,d\Theta = S(t)\,d\Theta,$$

so that the above equation yields the simple solution

$$r_1 = \sin(\Theta_0 - \Theta_1). \tag{15.64}$$


Here we have identified the epoch $t_0$ with the receiver and $t_1$ with the
source. Using the fact that the redshift of the source is z, formulae
(14.29) and (15.45) give

$$1 + z = \frac{1 - \cos\Theta_0}{1 - \cos\Theta_1},$$

which gives

$$\cos\Theta_1 = \frac{z + \cos\Theta_0}{1 + z}. \tag{15.65}$$


On putting together the values of $\cos\Theta_1$ and $\sin\Theta_1$ in Equation (15.64)
and using the corresponding values of the trigonometric functions of $\Theta_0$
from Equations (15.47), we get

$$r_1 = \frac{\sqrt{2q_0 - 1}\left[q_0 z + (q_0 - 1)\left(\sqrt{1 + 2q_0 z} - 1\right)\right]}{q_0^2(1 + z)}. \tag{15.66}$$

The luminosity distance is therefore given by

$$D_1 = r_1 S_0(1 + z) = \left(\frac{c}{H_0}\right)\frac{1}{q_0^2}\left[q_0 z + (q_0 - 1)\left(\sqrt{1 + 2zq_0} - 1\right)\right]. \tag{15.67}$$
This formula was first derived by Mattig [56] in 1958. 

The formula for the open universe can be similarly derived and leads
to exactly the same final answer. The case of the Einstein-de Sitter model
can be obtained by letting $q_0$ tend to 1/2. Of course it is much simpler
to derive the result directly from the original formulae for that model.
We give the final answer for this case below:

$$D_1 = \frac{2c}{H_0}\left[(1 + z) - (1 + z)^{1/2}\right]. \tag{15.68}$$

Although it is good to be able to derive these formulae analytically,
the facility of the computers has made the exercise rather unnecessary.
Nevertheless, the above exercise is useful in clarifying the roles of
various expressions in cosmology, which otherwise remain hidden in
a computer programme. Figure 15.5 shows how D varies with z for
different $q_0$.

Fig. 15.5. The luminosity distance D as a function of redshift plotted for various
values of $q_0$ for the simple Friedmann models (with $\Lambda = 0$).
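The Mattig formula (15.67) and its Einstein-de Sitter limit (15.68) are short enough to evaluate directly. In the Python sketch below the function names are hypothetical and distances are measured in units of $c/H_0$ (an assumption made for convenience); setting $q_0 = 1/2$ in the general formula should reproduce (15.68).

```python
import math


def mattig_distance(z, q0, c_over_H0=1.0):
    """Luminosity distance of equation (15.67) for a dust model with q0 > 0."""
    bracket = q0 * z + (q0 - 1.0) * (math.sqrt(1.0 + 2.0 * q0 * z) - 1.0)
    return c_over_H0 * bracket / q0**2


def eds_distance(z, c_over_H0=1.0):
    """Luminosity distance of equation (15.68) for the Einstein-de Sitter model."""
    return 2.0 * c_over_H0 * ((1.0 + z) - math.sqrt(1.0 + z))


if __name__ == "__main__":
    for z in (0.1, 0.5, 1.0, 2.0):
        print(z,
              mattig_distance(z, 0.2),
              mattig_distance(z, 0.5),
              eds_distance(z),
              mattig_distance(z, 1.0))
```

For $q_0 = 1$ the bracket collapses to z, so the luminosity distance is exactly $cz/H_0$ at every redshift; at a fixed redshift, smaller values of $q_0$ give larger distances.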


15.5 The angular-size-redshift relation 

We will next consider an unusual effect that arises because of the
non-Euclidean geometry of the typical Robertson-Walker spacetime. In
Figure 15.6 we have a spherical galaxy $G_1$ of diameter d at redshift z.
How will its angular size $\Delta\theta_1$ depend on redshift? If we associate the
redshift with distance, then we expect distant galaxies to look smaller,
i.e., to have progressively smaller $\Delta\theta_1$.

Fig. 15.6. The angular size of an extended source is the angle it subtends at the
observer O as shown in the figure. The extremities of the source are the points
$A(\theta_1, \phi_1)$ and $B(\theta_1 + \Delta\theta_1, \phi_1)$.

To decide the answer to this question, consider two neighbouring
null geodesics (representing light rays) from the two points A and B at
the two extremities of $G_1$ directed towards our Solar System. Without
loss of generality we can choose our angular coordinates such that A
has the coordinates $(\theta_1, \phi_1)$, while B has the coordinates $(\theta_1 + \Delta\theta_1, \phi_1)$.
(Although we have used homogeneity to take r = 0 at our location, we
can also use isotropy to choose any particular direction as the polar
axis $\theta = 0$, $\theta = \pi$.)

According to the Robertson-Walker line element, the proper distance
between A and B is obtained by putting $t = t_1$ = constant,
$r = r_1$ = constant, $\phi = \phi_1$ = constant and $d\theta = \Delta\theta_1$ in (15.1). We
then get

$$ds^2 = -r_1^2 S^2(t_1)(\Delta\theta_1)^2 = -d^2,$$

since in the rest frame of $G_1$ the spacelike separation AB = d. Thus

$$\Delta\theta_1 = \frac{d}{r_1 S(t_1)} = \frac{d(1 + z)}{r_1 S(t_0)} \tag{15.69}$$


gives the answer to our question. 

Notice that as $r_1$ increases we are looking at more and more remote
galaxies, which must therefore be seen at earlier and earlier epochs $t_1$.
However, in an expanding universe $S(t_1)$ was smaller at earlier epochs
$t_1$, so it is not obvious that $r_1 S(t_1)$ should get progressively larger as we
look at more and more remote galaxies. In some cases, therefore, distant
objects may look bigger. The effect can be ascribed to ‘gravitational
bending or lensing’ of light as it passes through curved spacetime. (We
briefly discussed gravitational lensing in Chapter 12.) Clearly, we need to
know how fast $S(t_1)$ decreases as $r_1$ increases. Although (15.69) provides
the answer in an implicit form, we still need to know S(t) in order to be
able to perform these integrations.

Let us take the different Friedmann models in this context. It is easy
to derive this result for $q_0 = 1/2$. From (15.68) we get

$$\Delta\theta_1 = \frac{dH_0}{2c}\,\frac{(1 + z)^{3/2}}{(1 + z)^{1/2} - 1}. \tag{15.70}$$


Straightforward differentiation gives us the result that the minimum
value of $\Delta\theta_1$ ($= \theta_{\min}$, say) and the redshift $z = z_m$ at which it occurs are
given by

$$\theta_{\min} = \frac{27}{8}\,\frac{dH_0}{c} = 3.375\,\frac{dH_0}{c}$$

and

$$z_m = 1.25. \tag{15.71}$$
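The location of the minimum can also be found numerically from (15.70) alone. The Python sketch below, with hypothetical function names and angles measured in units of $dH_0/c$ (an assumption made for convenience), uses a golden-section search, chosen here simply because the function has a single minimum on the interval searched.

```python
import math


def eds_angular_size(z, d_H0_over_c=1.0):
    """Angular size of equation (15.70) for q0 = 1/2, in units of d H0 / c."""
    x = math.sqrt(1.0 + z)
    return 0.5 * d_H0_over_c * x**3 / (x - 1.0)


def golden_section_minimum(f, lo, hi, tol=1.0e-10):
    """Locate the minimum of a function with a single minimum on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d   # minimum lies in [a, d]
        else:
            a = c   # minimum lies in [c, b]
    return 0.5 * (a + b)


def eds_minimum():
    """Return (z_m, theta_min) for the Einstein-de Sitter model."""
    z_m = golden_section_minimum(eds_angular_size, 0.1, 10.0)
    return z_m, eds_angular_size(z_m)


if __name__ == "__main__":
    z_m, theta_min = eds_minimum()
    print("z_m =", z_m)
    print("theta_min in units of d H0 / c =", theta_min)
```

Beyond the redshift of the minimum the angular size increases again, which is the non-Euclidean behaviour described in the text.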

The cases $q_0 \neq 1/2$ are more involved. We illustrate the case
$q_0 > 1/2$. Instead of using $D_1$ as given by (15.67), it is more convenient
to use the parameter $\Theta$ introduced in (15.45) and (15.46) and the
relations (15.47). We then get

$$\Delta\theta_1 = \frac{d}{r_1 S(t_1)} = \frac{2d}{\alpha}\left[(1 - \cos\Theta_1)\sin(\Theta_0 - \Theta_1)\right]^{-1}. \tag{15.72}$$

The constant a is defined by Equation (15.44). Differentiation with 
respect to ©i tells us that the minimum occurs when 

sin©] sin(© 0 — ©i) — (1 — cos©i)cos(© 0 — ©i) = 0, 


that is, 


. ( 3©i\ 

sin^© 0 -— J = 0, 


thus giving 


©i = 


2©o 


1 + — 


1 — COS ©0 

1 — cos(2© 0 /3) 


Using (15.69) we get 

^ (29o - 1) 3/2 


dH a 


q o 


(>—(^)W^) « 

The corresponding result for q o < 1/2 is 

(1 - 2q 0 ) 3/2 1 dH 0 


©min — 


q o 


( cosh (2|o) - ^smhl^o) 


at the redshift z m given by 


1 + — 


cosh 'I'o — 1 


cosh 


m 


~ i 


(15.73) 


(15.74) 


(15.75) 


(15.76) 
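
The redshift of minimum angular size in the closed case can be tabulated with a few lines of code. This is a sketch only: it assumes that the present value of the parameter satisfies $\cos\Theta_0 = (1-q_0)/q_0$, which is the usual content of the relations (15.47) but is not reproduced in this section, and the function name is hypothetical.

```python
import math

def z_min_closed(q0):
    """Redshift of minimum angular size for a closed (q0 > 1/2) Friedmann model."""
    if q0 <= 0.5:
        raise ValueError("this branch needs q0 > 1/2")
    theta0 = math.acos((1.0 - q0) / q0)   # assumed form of relation (15.47)
    theta1 = 2.0 * theta0 / 3.0           # minimum condition derived above
    return (1.0 - math.cos(theta0)) / (1.0 - math.cos(theta1)) - 1.0

for q0 in (0.501, 1.0, 2.0):
    print(q0, round(z_min_closed(q0), 4))
```

As $q_0$ approaches 1/2 from above the result tends to the Einstein-de Sitter value 1.25, and for $q_0 = 1$ it gives $z_m = 1$.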


Figure 15.7 plots $\Delta\theta_1$ as a function of $z$ for different Friedmann models.
Notice how the curves all start with the near-Euclidean result $\Delta\theta_1 \propto z^{-1}$
and then begin to differ from one another at larger $z$ values. In principle
this effect might be used to decide which Friedmann model (if any!)
comes closest to the actual Universe.

This non-Euclidean effect was first pointed out by R. C. Tolman [57]
and a way of using it as a cosmological test was first suggested by Fred
Hoyle [58].

15.6 Horizons and the Hubble radius 

In cosmological discussions two kinds of horizons often crop up: the 
particle horizon relates to limits on communication in the past, whereas 
the event horizon relates to limits on communication in the future. We 
will deal with these two concepts in that order. 

15.6.1 The particle horizon 

It is pertinent to ask the following question. What is the limit on the
proper distance up to which we are able to see sources of light? This
question is answered as follows. Going back to Equation (14.27) of the
preceding chapter, we may have a situation wherein the integral on the
left-hand side has a maximum value at the given epoch $t_0$. This therefore
gives a maximum value $r_P$ for the radial coordinate $r_1$. For a galaxy with
$r_1 > r_P$ there is no communication with us in the above fashion.

First we calculate this limiting value $r_P$ of $r_1$, which for the Friedmann
models comes from setting the lower limit for the $t$-integral at
zero. The corresponding limiting proper distance is

$$R_P = S_0\int_0^{r_P}\frac{dr}{\sqrt{1-kr^2}} = S_0\int_0^{t_0}\frac{c\,dt}{S(t)}.$$


Fig. 15.7. The 'non-Euclidean' behaviour of angular size at large redshifts.

Fig. 15.8. The particle horizon of P is contained in the past null cone
(dotted lines) from P. Thus the particle B can causally affect P, but not
particle A, which lies outside the particle horizon of P.



It is then easy to verify that, for the various Friedmann models,

$$R_P = \frac{c}{H_0}\times\begin{cases} 2 & (k=0,\ q_0=\tfrac12) \\[4pt] \dfrac{2}{\sqrt{2q_0-1}}\,\sin^{-1}\sqrt{\dfrac{2q_0-1}{2q_0}} & (k=1,\ q_0>\tfrac12) \\[4pt] \dfrac{2}{\sqrt{1-2q_0}}\,\sinh^{-1}\sqrt{\dfrac{1-2q_0}{2q_0}} & (k=-1,\ q_0<\tfrac12). \end{cases} \tag{15.77}$$

The existence of a finite value of $R_P$ means that the Universe has a
particle horizon. Particles with $S(t_0)r_1 > R_P$ are not visible to us at
present, no matter how good our techniques of observation are.

Consider, for example, the Einstein-de Sitter model. The result
(15.77) gives, in this case, $R_P = 2c/H_0$. This means that at present
we are able to see only those galaxies whose proper distance from us
happens to be less than $2c/H_0$. See Figure 15.8.
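
The three branches of (15.77) can be evaluated together to see how the particle horizon varies with $q_0$. The sketch below works in units of $c/H_0$; the function name and the tolerance used to detect $q_0 = 1/2$ are illustrative assumptions.

```python
import math

def particle_horizon(q0):
    """Particle-horizon radius R_P in units of c/H0, following Eq. (15.77)."""
    if q0 <= 0.0:
        raise ValueError("dust Friedmann models need q0 > 0")
    if abs(q0 - 0.5) < 1e-12:
        return 2.0                                   # Einstein-de Sitter case
    if q0 > 0.5:                                     # closed model, k = +1
        root = math.sqrt(2.0 * q0 - 1.0)
        return (2.0 / root) * math.asin(root / math.sqrt(2.0 * q0))
    root = math.sqrt(1.0 - 2.0 * q0)                 # open model, k = -1
    return (2.0 / root) * math.asinh(root / math.sqrt(2.0 * q0))

for q0 in (0.1, 0.5, 1.0):
    print(q0, round(particle_horizon(q0), 4))
```

Both the closed and the open branches approach the value 2 as $q_0 \to 1/2$, so the three expressions join continuously.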


15.6.2 The event horizon 

The particle horizon sets a limit to communications from the past. Let
us now see how the event horizon sets a limit on communications to the
future. Let us ask the following question. A light source at $r = r_1$, $t = t_0$
sends a light signal to an observer at $r = 0$. Will the signal ever reach
its destination? Suppose it does and let $t_1$ be the time of arrival. Then
from (14.27) we get

$$\int_{t_0}^{t_1}\frac{c\,dt}{S(t)} = \int_0^{r_1}\frac{dr}{\sqrt{1-kr^2}}.$$

This relation determines $t_1$ for any given $r_1$, provided that the integral
on the left is large enough to match that on the right. Now it may happen
that as $t_1 \to \infty$ the integral on the left converges to a finite value that
corresponds to a value of the integral on the right for $r_1 = r_E$, say. In
that case it is not possible to satisfy the above relation for $r_1 > r_E$. In
other words the signal from the light source at $r_1 > r_E$ will never reach
the observer at $r = 0$. Thus no two observers can communicate beyond a
proper distance

$$R_E = S_0\int_{t_0}^{\infty}\frac{c\,dt}{S(t)} \tag{15.78}$$

at $t = t_0$.

This limit is called the event horizon. It does not exist for Friedmann
models but has the value $c/H_0$ for the de Sitter model, as can be seen in
the following calculation.


Example 15.6.1 Consider the de Sitter model described in Chapter 14.
Here we have $k = 0$ and $S = e^{H_0 t}$. Then we get

$$R_E = e^{H_0 t_0}\int_{t_0}^{\infty} c\,e^{-H_0 t}\,dt = \frac{c}{H_0}.$$

That is, if any light source sends a ray of light from beyond this range
at time $t_0$ towards the observer at $r = 0$, it will never reach the observer. See
Figure 15.9. The reader will immediately notice the similarity of the event
horizon here to that for a black hole (see Chapter 13).
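
The contrast between a convergent and a divergent horizon integral can be made concrete. The sketch below evaluates the integral in (15.78) with a finite upper limit $T$, in closed form, for the de Sitter scale factor and for the Einstein-de Sitter scale factor $S \propto t^{2/3}$ (assumed here from the earlier discussion of that model); units with $c = 1$ and the function names are illustrative.

```python
import math

def signal_reach_de_sitter(T, H=1.0, c=1.0, t0=0.0):
    """S0 * integral from t0 to T of c dt / S(t), for S = exp(H t)."""
    return (c / H) * (1.0 - math.exp(-H * (T - t0)))

def signal_reach_eds(T, c=1.0, t0=1.0):
    """The same integral for the Einstein-de Sitter case S ~ t^(2/3)."""
    return 3.0 * c * t0 * ((T / t0) ** (1.0 / 3.0) - 1.0)

for T in (10.0, 1.0e3, 1.0e6, 1.0e9):
    print(T, signal_reach_de_sitter(T), signal_reach_eds(T))
```

The de Sitter value saturates at $c/H$ as $T$ grows, whereas the Einstein-de Sitter value keeps increasing like $T^{1/3}$, so no finite $R_E$ exists in that case.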


Notice that both the event horizon and the particle horizon have
radii comparable to $c/H_0$, which has led to an erroneous conclusion that
the length $R_H = c/H_0$ is of the size of the horizon in any cosmology.
Whether a horizon (particle or event) exists in a cosmological model
depends on the scale factor and how the relevant integral (discussed
above) behaves. Thus there are cosmological models that do not have any
horizon, and for such models the above length does not have any ‘signal-limiting’
significance. In such cases, it is best to call this length $R_H$, the
Hubble radius. The Hubble radius as defined here tells us only the characteristic
distance scale of the Universe at $t = t_0$; it does not have any
causal significance unless it is shown to have horizon properties. We may
compare it with the Hubble time scale $\tau_0$ defined in the previous chapter.

Fig. 15.9. Light rays from sources within the event horizon (shown by
thick vertical lines at a distance $c/H_0$) will reach the observer P. Light
rays from sources outside will never reach P.
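
To attach numbers to the Hubble radius and the Hubble time scale, the following sketch converts a value of $H_0$ in km s$^{-1}$ Mpc$^{-1}$ into a length in Mpc and a time in units of $10^9$ years. The conversion constants are rounded standard values and the function names are hypothetical; the input 70 is the value used in Exercise 4.

```python
C_KM_S = 299792.458          # speed of light in km/s
KM_PER_MPC = 3.0857e19       # kilometres in one megaparsec (rounded)
SECONDS_PER_GYR = 3.156e16   # seconds in 10^9 years (rounded)

def hubble_radius_mpc(h0):
    """Hubble radius c/H0 in Mpc for H0 given in km/s/Mpc."""
    return C_KM_S / h0

def hubble_time_gyr(h0):
    """Hubble time 1/H0 in units of 10^9 years for H0 given in km/s/Mpc."""
    return KM_PER_MPC / h0 / SECONDS_PER_GYR

print(hubble_radius_mpc(70.0), hubble_time_gyr(70.0))
```

For $H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$ this gives a Hubble radius a little under 4300 Mpc and a Hubble time close to 14 billion years.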


15.7 Source counts 

The distribution of discrete luminous sources out to great distances may
give indications that spacetime geometry is non-Euclidean. How does
the number of galaxies up to coordinate distance $r_1$ (that is, up to the
distance of galaxy $G_1$) increase with $r_1$? Let us suppose that at any epoch
$t$ there are $n(t)$ galaxies in a unit comoving coordinate volume (using
the $r, \theta, \phi$ coordinates). The word ‘comoving’ indicates that, although
the galaxies individually retain the same coordinates $(r, \theta, \phi)$, the proper
separation between them at any epoch increases with epoch according
to the scale factor $S(t)$. Thus the proper volume of any region bounded
by such galaxies increases as $S^3$.

When we observe galaxies at radial coordinates between $r$ and $r + dr$,
we see them at times in the range $(t, t + dt)$, where, from (14.27),

$$\frac{c\,dt}{S(t)} = -\frac{dr}{\sqrt{1-kr^2}}. \tag{15.79}$$

The number of galaxies seen in this shell is therefore

$$dN = \frac{4\pi r^2\,dr}{\sqrt{1-kr^2}}\,n(t), \tag{15.80}$$

where $t$ is related to $r$ through (15.79). Thus the required number of
galaxies out to $r = r_1$ is given by

$$N(r_1) = \int_0^{r_1}\frac{4\pi r^2\,n(t)\,dr}{\sqrt{1-kr^2}}. \tag{15.81}$$


If no galaxies are created or destroyed between $r = 0$ and $r = r_1$, we
may take $n(t)$ = constant, and the integral can be explicitly evaluated.
Clearly, the answer must depend on the parameter $k$. If we draw a
sphere whose surface lies at a proper distance $R$ from the centre in the
$k = 0$ (Euclidean) space, its volume will be $4\pi R^3/3$. However, a similar
sphere drawn in the $k = +1$ (closed) space will have a volume less than
$4\pi R^3/3$, whereas a sphere drawn in the $k = -1$ (open) space will have
a volume exceeding this value.

We now apply the above formula to Friedmann models. It is more
convenient to use redshift as the distance parameter instead of $r$ or $t$. As
an example, we will work with the case $k = +1$. From (15.64) and the
relations that follow it, we have

$$r = \sin(\Theta_0-\Theta_1), \qquad \frac{dr}{\sqrt{1-r^2}} = |d\Theta_1|,$$

$$1+z = \frac{\sin^2(\Theta_0/2)}{\sin^2(\Theta_1/2)}, \qquad \frac{dz}{1+z} = \cot\frac{\Theta_1}{2}\,|d\Theta_1| = \sqrt{\frac{1+2q_0z}{2q_0-1}}\,|d\Theta_1|.$$

Therefore the number of astronomical sources with redshifts in the
range $(z, z + dz)$ is given by

$$dN = 4\pi\sin^2(\Theta_0-\Theta_1)\cdot n(t)\cdot\left|\frac{d\Theta_1}{dz}\right|dz.$$

Let us suppose that $n(t)$ is specified as a function $n(z)$ of $z$. Using (15.65)
and some algebraic manipulation, we get

$$dN = 4\pi\,n(z)\,\frac{(2q_0-1)^{3/2}}{q_0^4}\,\frac{\left[q_0z + (q_0-1)\left(\sqrt{1+2q_0z}-1\right)\right]^2 dz}{\sqrt{1+2q_0z}\,(1+z)^3}. \tag{15.82}$$

Suppose $n(z)$ is expressed in a slightly different form. We recall that
$n$ was specified as the number of sources per unit coordinate volume,
in terms of the comoving $(r, \theta, \phi)$ coordinates. What is the relationship
between $n$ and the number of sources per unit proper volume? Denoting
the latter by $\hat n$, we have

$$n = \frac{\hat n\,S_0^3}{(1+z)^3}. \tag{15.83}$$

From (15.38) we get

$$\left(\frac{c}{H_0}\right)^3 = (2q_0-1)^{3/2}\,S_0^3. \tag{15.84}$$

Substitution into (15.82) gives

$$dN = 4\pi\left(\frac{c}{H_0}\right)^3\frac{\left[q_0z + (q_0-1)\left(\sqrt{1+2q_0z}-1\right)\right]^2\,\hat n\,dz}{q_0^4\,(1+z)^6\,\sqrt{1+2q_0z}}. \tag{15.85}$$

In this form (15.85) is applicable to all Friedmann models, even though
our derivation assumed $q_0 > 1/2$ and $k = 1$.
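
A short numerical sketch shows how the count formula (15.85) behaves for different values of $q_0$. It assumes, purely for illustration, that sources are conserved, so that the proper density scales as $\hat n = \hat n_0(1+z)^3$; the function name, the default units ($c/H_0 = 1$, $\hat n_0 = 1$) and that conservation assumption are not part of the original text.

```python
import math

def counts_per_unit_redshift(z, q0, n0=1.0, hubble_radius=1.0):
    """dN/dz from Eq. (15.85), assuming conserved sources: n_hat = n0 (1+z)^3."""
    root = math.sqrt(1.0 + 2.0 * q0 * z)
    bracket = q0 * z + (q0 - 1.0) * (root - 1.0)
    return (4.0 * math.pi * hubble_radius**3 * n0 * bracket**2
            / (q0**4 * (1.0 + z)**3 * root))

for q0 in (0.2, 0.5, 1.0):
    row = [round(counts_per_unit_redshift(z, q0), 4) for z in (0.01, 0.5, 2.0)]
    print(q0, row)
```

At small redshift every model reduces to the Euclidean count $dN \approx 4\pi (c/H_0)^3 \hat n_0 z^2\,dz$; the models separate only at larger $z$, which is why the curvature effect is hard to isolate observationally.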

This formula played a big role in the early development of observational
cosmology. In the 1930s Hubble expected to measure the curvature
effects in the counts of galaxies, assuming that galaxies are uniformly
distributed. He found out that the effect, if it exists, is too minute to
be measurable. Two decades later radio astronomers attempted a similar
study using powerful extragalactic radio sources. Here too the curvature
effects became dwarfed by other variables such as the spectrum of
luminosity of the sources, the evolution of density and luminosity with
epoch, and possible large-scale inhomogeneity of the radio-source
distribution.


15.8 Cosmological models with the λ-term

Although our concern in this chapter was mainly with the simplest
Friedmann models, we now discuss briefly another class of models given
by the modified Einstein equations (14.7), the equations containing the
cosmological constant $\lambda$. We have already discussed two special cases of
this class of solutions in the last chapter, namely the static Einstein model
and the empty de Sitter model. When Hubble’s observations established
the expanding-Universe picture, Einstein conceded that there was no
special need for the $\lambda$-term in his equations. In the post-Hubble-law era,
he dropped this term from his equations, and the Einstein-de Sitter model
discussed in this chapter was the outcome of Einstein’s collaboration
with de Sitter after abandoning the $\lambda$-term.

Nevertheless, in the 1930s eminent cosmologists such as A. S.
Eddington and Abbé Lemaître felt that the $\lambda$-term introduced certain
attractive features into cosmology and that models based on it should
also be discussed at length. In modern cosmology the reception given to
the $\lambda$-term has varied from the hostile to the ecstatic. The term is quietly
forgotten if the observational situation does not demand models based
on it. It is resurrected if it is found that the standard Friedmann models
without this term are being severely constrained by observations. The
present compulsion for this term comes partly because of the observational
constraints and partly because inputs from particle physics in the
very early stages of the Universe have provided a new interpretation for
the $\lambda$-term, which we shall discuss in Chapter 16.

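Putting $\lambda \neq 0$, (15.10) and (15.11) are modified to the following:

$$2\frac{\ddot S}{S} + \frac{\dot S^2 + kc^2}{S^2} - \lambda c^2 = \frac{8\pi G}{c^2}\,T^1_1, \tag{15.86}$$

$$\frac{\dot S^2 + kc^2}{S^2} - \frac{1}{3}\lambda c^2 = \frac{8\pi G}{3c^2}\,T^0_0. \tag{15.87}$$

The conservation laws discussed earlier are not affected by the $\lambda$-term.
If we restrict ourselves to dust only, (15.87) gives us the following
differential equation in place of (15.28):

$$\frac{\dot S^2 + kc^2}{S^2} - \frac{1}{3}\lambda c^2 = \frac{8\pi G\rho_0}{3}\,\frac{S_0^3}{S^3}. \tag{15.88}$$

Similarly, (15.86) becomes

$$2\frac{\ddot S}{S} + \frac{\dot S^2 + kc^2}{S^2} - \lambda c^2 = 0. \tag{15.89}$$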

Let us first recover the static model of Einstein. By setting $S = S_0$,
$\dot S = 0$, $\ddot S = 0$ in (15.88) and (15.89), we get

$$\frac{kc^2}{S_0^2} - \frac{1}{3}\lambda c^2 = \frac{8\pi G\rho_0}{3}, \qquad \frac{kc^2}{S_0^2} = \lambda c^2.$$

From these relations it is not difficult to verify that $k = +1$, and we
recover the relations obtained in Chapter 14:

$$\lambda = \frac{1}{S_0^2}, \tag{15.90}$$

$$\rho_0 = \frac{\lambda c^2}{4\pi G}. \tag{15.91}$$


We shall denote by $\lambda = \lambda_c$ the critical value of $\lambda$ for which a static
solution is possible. It was pointed out by Eddington that the Einstein
universe is unstable. A slight perturbation destroying the equilibrium
conditions (15.90) and (15.91) leads to either a collapse to singularity
($S \to 0$) or an expansion to infinity ($S \to \infty$). Eddington and Lemaître
instead proposed a model in which $\lambda$ exceeds $\lambda_c$ by a small amount. In
this case the Universe erupts from $S = 0$ (the big bang) and slows down
near $S = S_0$, staying thereabouts for a long time and then expanding away
to infinity. It was argued that the quasistationary phase of the Universe
would be suitable for the formation of galaxies. This model is illustrated
in Figure 15.10, which plots $S(t)$ for a range of values of $\lambda$ for $k = +1$.
The initial (explosive) phase of the Eddington-Lemaître model is shown
along the section OP of the curve OPQR, with PQ the quasistationary
phase and QR the final accelerated expansion. Notice that for $\lambda < \lambda_c$
the Universe contracts (as in the Friedmann case), whereas for $\lambda > \lambda_c$ it
ultimately disperses to infinity, resembling the de Sitter universe.

Fig. 15.10. Friedmann models with $k = +1$, $\lambda > 0$ are shown here. The
scale factor $S(t)$ is plotted against the cosmic time $t$.

Figure 15.10 also shows by a dashed line one of another series of
models that contract from infinity to a minimum value of $S > 0$ and then
expand back to $S \to \infty$. These models are sometimes called oscillating
models of the second kind, to distinguish them from the models that start
from and shrink back to $S = 0$ and are called oscillating models of the
first kind. This terminology is, however, not quite apt, since there is no
repetition of phases in these models as implied by the word ‘oscillating’.

The models with $k = 0$ or $k = -1$ do not show these different types
of behaviour for $\lambda > 0$. We get from (15.88) a relation of the following
type:

$$\dot S^2 = -kc^2 + \frac{1}{3}\lambda c^2 S^2 + \frac{8\pi G\rho_0 S_0^3}{3S}, \tag{15.92}$$

wherein each term on the right-hand side is non-negative. Thus $\dot S$ does
not change sign, and we get ever-expanding models. For $\lambda < 0$, however,
we can get universes that expand and then recontract as in the $k = 1$ case
for $\lambda < \lambda_c$.

Fig. 15.11. A plot of $\Omega_\Lambda$ against $\Omega_0$ for various Friedmann models is
shown for various values of $q_0$. The 'flat-model' line $\Omega_0 + \Omega_\Lambda = 1$
is of special interest since inflation predicts flat models (see Chapter 16).

This concludes our discussion of the general dynamical behaviour of
the $\lambda$-cosmologies. We end this section by writing (15.88) and (15.89)
at the present epoch in terms of $H_0$ and $q_0$. Thus in place of earlier
relations we have

$$H_0^2 + \frac{kc^2}{S_0^2} - \frac{1}{3}\lambda c^2 = H_0^2\,\Omega_0, \tag{15.93}$$

$$(1 - 2q_0)H_0^2 + \frac{kc^2}{S_0^2} - \lambda c^2 = 0. \tag{15.94}$$

From these we get

$$\Omega_0 = 2q_0 + \frac{2}{3}\,\frac{\lambda c^2}{H_0^2}. \tag{15.95}$$

Now there is no unique relationship between $q_0$ and $\Omega_0$: we have an
additional parameter entering the relation. Note also that it is possible
to have negative $q_0$, that is, an accelerating expansion, if $\lambda > 0$. This is
because the $\lambda$-term introduces a force of cosmic repulsion.

Fig. 15.12. The age of the Universe plotted against $\Omega_0$ for various
values of $\Omega_\Lambda$. The age is seen to increase as $\Omega_\Lambda$ is increased. Thus
model A with $\Omega_\Lambda = 0$ has the least age, while the case C for
$\Omega_\Lambda = 0.8$ has nearly double that age.

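Finally, if the Universe is spatially flat, i.e., $k = 0$, then the following
rewrite of relation (15.93) can confirm the fact:

$$\Omega_0 + \frac{1}{3}\,\frac{\lambda c^2}{H_0^2} = 1.$$

By writing

$$\frac{1}{3}\,\frac{\lambda c^2}{H_0^2} = \Omega_\Lambda, \tag{15.96}$$

the above relation is often expressed in the form

$$\Omega_0 + \Omega_\Lambda = 1. \tag{15.97}$$

See Figure 15.11 showing these relationships in the $\Omega_0$-$\Omega_\Lambda$ plane.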
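
The relations (15.93), (15.95) and (15.96) can be combined into two small helper functions that read off the deceleration parameter and the sign of the curvature from a pair of density parameters. This is a sketch with hypothetical function names; the sample values 0.3 and 0.7 are illustrative inputs, not figures taken from the text.

```python
def deceleration(omega_0, omega_lambda):
    """q0 from Eq. (15.95), rewritten as Omega_0 = 2 q0 + 2 Omega_Lambda."""
    return 0.5 * omega_0 - omega_lambda

def curvature_sign(omega_0, omega_lambda, tol=1e-12):
    """k = +1, 0 or -1, from Eq. (15.93) divided through by H0^2."""
    total = omega_0 + omega_lambda
    if abs(total - 1.0) < tol:
        return 0
    return 1 if total > 1.0 else -1

for pair in ((1.0, 0.0), (0.3, 0.0), (0.3, 0.7)):
    print(pair, deceleration(*pair), curvature_sign(*pair))
```

The expansion accelerates ($q_0 < 0$) whenever $\Omega_\Lambda > \Omega_0/2$, which is the region of the plane below the line $q_0 = 0$ in a plot such as Figure 15.11.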

A comment is needed here to explain to the reader one reason why
the $\lambda$-containing models are preferred these days. The measurements of
Hubble’s constant and the estimates of ages of stars in globular clusters
suggest that the ages of the $\lambda = 0$ models are inadequate to accommodate
the stellar ages. As Figure 15.12 shows, by having a positive cosmological
constant, the age of the Universe can be increased. Therefore, the
age constraint can be relaxed if $\lambda > 0$.

With these remarks we wind up our discussion of relativistic cosmological
models dominated by dust. We will take up the radiation-dominated
models in the following chapter.
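
The age argument made above can be checked numerically. Dividing (15.92) by $H_0^2 S_0^2$ and using (15.93) and (15.96) gives $(dx/d\tau)^2 = \Omega_0/x + (1 - \Omega_0 - \Omega_\Lambda) + \Omega_\Lambda x^2$ with $x = S/S_0$ and $\tau = H_0 t$, so the age of a big-bang model is the integral of $d\tau/dx$ from 0 to 1. The sketch below evaluates it with a simple midpoint rule; the function name and step count are illustrative, and the routine assumes parameter values for which the expression under the square root stays positive.

```python
import math

def age_in_hubble_times(omega_0, omega_lambda, steps=20000):
    """H0 * t0 for a dust model with a lambda-term, by the midpoint rule."""
    omega_k = 1.0 - omega_0 - omega_lambda    # curvature term, from (15.93)
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h                     # x = S / S0 at the cell midpoint
        total += h / math.sqrt(omega_0 / x + omega_k + omega_lambda * x * x)
    return total

for pair in ((1.0, 0.0), (0.3, 0.0), (0.3, 0.7)):
    print(pair, round(age_in_hubble_times(*pair), 4))
```

The Einstein-de Sitter case returns 2/3 of the Hubble time, and at fixed $\Omega_0$ the age grows as $\Omega_\Lambda$ is raised, in line with the trend described for Figure 15.12.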


Exercises 

1. Verify the expressions for the Ricci tensor and the Einstein tensor for the 
Robertson-Walker line element. 

2. Deduce Equation (15.14) from Equation (15.15). 

3. Using the Einstein-de Sitter model, estimate the epoch at which the matter
and radiation densities in the Universe were equal. For this calculation take
$\rho_0 = 10^{-29}$ g cm$^{-3}$ and $\epsilon_0 = 10^{-13}$ erg cm$^{-3}$, and express your answer as a
fraction of the age of the Universe.

4. A galaxy is observed with redshift 0.69. How long did light take to travel
from the galaxy to us if we assume that we live in the Einstein-de Sitter universe
with Hubble’s constant = 70 km s$^{-1}$ Mpc$^{-1}$?

5. In the Friedmann universe with $q_0 = 1$, a galaxy is seen with redshift $z = 1$.
How old was the universe at the time this galaxy emitted the light received
today? (Take $H_0 = 100$ km s$^{-1}$ Mpc$^{-1}$.)

6. A light ray is emitted at the present epoch in the closed Friedmann universe. 
Discuss the possibility of this ray making a round of the universe and coming 
back to its starting point. 

7. Invert the formula (15.67) to express $z$ as a function of $x = D_1H_0/c$. Show
that

$$z = q_0x - (q_0 - 1)\left(\sqrt{1 + 2x} - 1\right).$$

Use this relation to show how the linear Hubble velocity-distance relation begins
to fan out for cosmological models with different values of $q_0$.

8. Show why the Friedmann models with $\lambda = 0$ do not have event horizons.

9. The surface brightness of an astronomical object is defined as the flux 
received from the object divided by the angular area subtended by the object at 
the observation point. How does the surface brightness vary with redshift? 

10. Show from first principles that the angular sizes of astronomical objects 
of fixed linear size will have a minimum at z = 1.25 in the Einstein-de Sitter 
model. 

11. If in a Friedmann universe we have a fixed number of sources in a unit
comoving coordinate volume and each source emits line radiation of fixed total
intensity $L_0$ at frequency $\nu_0$, show that the radiation background produced by
such sources at the present epoch will have the frequency spectrum $S(\nu)\,d\nu$,
where $S(\nu) = 0$ for $\nu > \nu_0$, whereas for $\nu \le \nu_0$

$$S(\nu) = \frac{c}{H_0}\,n_0L_0\,\frac{\nu^{3/2}}{\nu_0^2\sqrt{2q_0\nu_0 - (2q_0 - 1)\nu}},$$

where $n_0$ is the proper number density of sources at the present epoch.

12. Given that objects during the quasistationary phase of the Eddington-Lemaître
cosmology are now seen with the redshift $z = 2$, what can you say
about the value of $\lambda$?


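13. Deduce that the scale factor in the $\lambda$-cosmology with $k = 1$ satisfies the
differential equation

$$\dot S^2 = c^2\left(\frac{1}{3}\lambda S^2 - 1 + \frac{c}{H_0}\,\frac{\Upsilon}{S}\right),$$

where

$$\Upsilon = \left(2q_0 + \frac{2\lambda c^2}{3H_0^2}\right)\left(2q_0 - 1 + \frac{\lambda c^2}{H_0^2}\right)^{-3/2}.$$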



14. Write down an integral that gives the age of a big-bang universe for $\lambda \neq 0$.
Discuss qualitatively how the $\lambda$-term may be used to increase the age of the
universe.

15. In $\lambda$-cosmology, what is the lower limit on the value of $\lambda$ given the value of
$q_0$?

16. Compute the invariants $R$, $R_{ik}R^{ik}$ and $R_{iklm}R^{iklm}$ for the Friedmann models
and show that they all diverge as $S \to 0$. Is there an exceptional case?

17. Give a general argument to show that, for sufficiently small $S$, the $\lambda$-force
is ineffective at preventing the spacetime singularity.

18. Identify the region on the $\Omega_0$-$\Omega_\Lambda$ plane corresponding to accelerating
models of the Universe.


19. The steady-state model is described by the de Sitter line element and a 
constant density at all epochs from t = —oo to t = +oo. Does this model have 
an infinite sky background as calculated by Olbers? Verify your answer by direct 
calculation. 


20. Show that the line element

$$ds^2 = e^{\nu}\,dT^2 - e^{-\nu}\,dR^2 - R^2(d\theta^2 + \sin^2\theta\,d\phi^2),$$

where $e^{\nu} = 1 - (2GM/R) - (\lambda R^2/3)$, describes a spherically symmetric
distribution of matter of mass $M$ in an otherwise empty, asymptotically de Sitter,
universe. Discuss the effect of the $\lambda$-term on the Solar-System tests of general
relativity.



Chapter 16 

The early Universe 


16.1 The radiation-dominated Universe 

In the previous chapter we discussed simple cosmological models in
which the contents were described as ‘dust’, i.e., pressure-free matter.
We saw that the density $\rho$ of the matter behaves as $\sim S^{-3}$. We also saw
briefly that, if radiation were present, its energy density would vary with
the scale factor as

$$\epsilon \propto \frac{1}{S^4}.$$

At present the Universe is matter-dominated; but if we compare the
different rates at which $\rho$ and $\epsilon$ increased in the past, we find that there
was an epoch in the past when the two energy densities were equal. Let
us denote this epoch by its redshift $z_{\rm eq}$. This means that at this epoch the
scale factor was a fraction $1/(1 + z_{\rm eq})$ of its present value. Figure 16.1
illustrates the relative variations of matter and radiation densities. We see
that, prior to this epoch, radiation dominated over matter in determining
the dynamics of the Universe through the Einstein field equations.

What about temperature? During the radiation-dominated era, the
temperature was determined by radiation and a simple calculation shows
how high the temperature might have been. This calculation requires
the assumption that at present we have a radiation density $u_0$ that is a
relic of an early hot era. With this assumption, the radiation energy
density at a past epoch with scale factor $S$ was given by (15.24):

$$\epsilon = u_0\,\frac{S_0^4}{S^4}. \tag{16.1}$$

Fig. 16.1. Matter and radiation densities plotted against the redshifts of
past epochs using a logarithmic scale. If the dark-matter density is
ignored, the two were equal at the epoch of redshift around $\sim 10^3$.



We may therefore assume that in the early epochs the dynamics of 
expansion was determined by radiant energy rather than by matter in the 
form of dust and that these were high-temperature epochs. 

We illustrate the above ideas with a simplified calculation by assuming
that the radiation was in blackbody form with temperature $T$, so
that

$$\epsilon = aT^4, \tag{16.2}$$

where $a$ is the radiation constant. This means that in the early stages of
the big-bang Universe

$$T^0_0 = aT^4, \qquad T^1_1 = T^2_2 = T^3_3 = -\frac{1}{3}aT^4. \tag{16.3}$$

We also anticipate that the space-curvature parameter $k$ will not affect
the dynamics of the early Universe significantly, and set it equal to zero.
Thus, from (15.11),

$$\frac{\dot S^2}{S^2} = \frac{8\pi Ga}{3c^2}\,T^4. \tag{16.4}$$

Further, from (16.2) and (16.1) we get

$$T = \frac{A}{S}, \qquad A = \text{constant}. \tag{16.5}$$


Substituting (16.5) into (16.4) gives a differential equation for $S$ that can
easily be solved. Setting $t = 0$ at $S = 0$, we get

$$S = A\left(\frac{3c^2}{32\pi Ga}\right)^{-1/4} t^{1/2} \tag{16.6}$$

and, more importantly,

$$T = \left(\frac{3c^2}{32\pi Ga}\right)^{1/4} t^{-1/2}. \tag{16.7}$$

Notice that all the quantities inside the parentheses on the right-hand
side of the above equation are known physical quantities. Thus, by
substituting their values into (16.7), we can express the above result in
the following form:

$$T_{\rm kelvin} = 1.52\times10^{10}\;t_{\rm seconds}^{-1/2}. \tag{16.8}$$

In other words, about one second after the big bang the radiation temperature
of the Universe was $1.52\times10^{10}$ K. The Universe at this stage was
certainly hot enough to have free neutrons and protons around, which,
as the Universe expanded, cooled down to facilitate the formation of
atomic nuclei. It was George Gamow who appreciated the significance
of the early hot era and conjectured that all chemical elements found in
the Universe were formed in a primordial nucleosynthesis process [59].
In short, the Universe acted as a fusion reactor.
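
The numerical coefficient in the time-temperature relation can be reproduced directly from the constants in the formula $T = (3c^2/32\pi G a)^{1/4}\,t^{-1/2}$. The sketch below uses rounded CGS values for $G$, $c$ and the radiation constant $a$; the constant names and the function name are illustrative.

```python
import math

G = 6.674e-8        # gravitational constant, cm^3 g^-1 s^-2 (rounded)
C = 2.998e10        # speed of light, cm s^-1 (rounded)
A_RAD = 7.566e-15   # radiation constant a, erg cm^-3 K^-4 (rounded)

def temperature_kelvin(t_seconds):
    """Radiation temperature t seconds after the big bang, k = 0, radiation only."""
    coefficient = (3.0 * C**2 / (32.0 * math.pi * G * A_RAD)) ** 0.25
    return coefficient / math.sqrt(t_seconds)

for t in (1.0, 100.0, 1.0e4):
    print(t, f"{temperature_kelvin(t):.3e}")
```

With these inputs the coefficient comes out within about one per cent of $1.52\times10^{10}$, and the temperature falls by a factor of ten for every hundredfold increase in the time.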

The idea of a hot big bang, as the above picture is called, depends 
therefore on the assumption that there is relic radiation present today. 
Later in this chapter we will present the argument that the microwave 
background discovered in 1965 by Arno Penzias and Robert Wilson 
is that relic radiation. For the present we will accept this evidence as 
confirming Gamow’s notion of the hot big bang and proceed further. 


16.2 Primordial nucleosynthesis 

This being a book primarily on general relativity, rather than on cosmology,
we will rush through the description of how and when atomic
nuclei were synthesized. For details we refer the reader to the companion
volume on cosmology [53]. Here we summarize the important steps in
this process.


16.2.1 Distribution functions

Assuming an ideal-gas approximation and thermodynamic equilibrium,
it is possible to write down the distribution functions of any given species
of particles like neutrons, protons, photons, etc. Let us use the symbol A
to denote a typical species. Thus $n_A(P)\,dP$ denotes the number density
of species A in the momentum range $(P, P + dP)$, where

$$n_A(P) = \frac{g_A P^2}{2\pi^2\hbar^3}\left[\exp\left(\frac{E_A(P) - \mu_A}{kT}\right) \pm 1\right]^{-1}. \tag{16.9}$$





In the above formula $T$ is the temperature of the distribution, $g_A$ the
number of spin states of the species and $k$ the Boltzmann constant,
while the equation

$$E_A^2 = c^2P^2 + m_A^2c^4 \tag{16.10}$$

relates the energy to the rest mass $m_A$ and momentum $P$ of a typical
particle of species A. Thus for the electron $g_A = 2$, for the neutrino $g_A = 1$,
$m_A = 0$, and so on. The + sign in (16.9) applies to particles obeying
Fermi-Dirac statistics (these particles are called fermions), while the −
sign applies to particles obeying Bose-Einstein statistics (particles
known as bosons). For example, electrons and neutrinos are fermions,
whereas photons are bosons. The quantity $\mu_A$ is the chemical potential
of the species A. The number densities of these species are needed in
order to work out their chemical potentials. Since photons can be emitted
and absorbed in any amounts, it is normally assumed that $\mu_\gamma = 0$.
Present observations suggest that for baryons (B) the ratio

$$\frac{N_B}{N_\gamma} = \frac{\text{number density of baryons}}{\text{number density of photons}} \sim 10^{-8}\text{ to }10^{-10}$$

is small compared with 1. The smallness of the baryon number density
suggests that the number densities of leptons may also be small compared
with $N_\gamma$, and it is usually assumed that this hypothesis provides a good
justification for taking $\mu_A = 0$ for all species. We will assume that
$\mu_A = 0$ for all species as a first approximation in our calculations to
follow. We will come back to this assumption at a later stage when it
may need modification.

We then get the following integrals for the number density ($N_A$),
energy density ($\epsilon_A$), pressure ($p_A$) and entropy density ($s_A$) of particle
A in thermal equilibrium:

$$N_A = \frac{g_A}{2\pi^2\hbar^3}\int_0^\infty\frac{P^2\,dP}{\exp[E_A(P)/(kT)] \pm 1}, \tag{16.11}$$

$$\epsilon_A = \frac{g_A}{2\pi^2\hbar^3}\int_0^\infty\frac{P^2E_A(P)\,dP}{\exp[E_A(P)/(kT)] \pm 1}, \tag{16.12}$$

$$p_A = \frac{g_A}{6\pi^2\hbar^3}\int_0^\infty\frac{c^2P^4[E_A(P)]^{-1}\,dP}{\exp[E_A(P)/(kT)] \pm 1}, \tag{16.13}$$

$$s_A = \frac{p_A + \epsilon_A}{T}. \tag{16.14}$$

We can deduce a simple relation from these formulae to show that
the entropy in a given comoving volume is constant as the Universe
expands. Differentiate $p_A$ with respect to $T$ to get

$$\frac{dp_A}{dT} = \frac{g_A}{6\pi^2\hbar^3}\int_0^\infty\frac{c^2P^4\exp[E_A(P)/(kT)]\,dP}{\{\exp[E_A(P)/(kT)] \pm 1\}^2\,kT^2}.$$

Now integrate by parts to get for the above integral

$$\frac{g_A}{6\pi^2\hbar^3T}\int_0^\infty\frac{[3P^2E_A + c^2P^4E_A^{-1}]\,dP}{\exp[E_A(P)/(kT)] \pm 1} = \frac{p_A + \epsilon_A}{T}.$$
On defining the pressure, energy density and entropy density for a mixture
of such gases in thermodynamic equilibrium by

$$p = \sum_A p_A, \qquad \epsilon = \sum_A \epsilon_A, \qquad s = \sum_A s_A, \tag{16.15}$$

we have the following relation:

$$\frac{dp}{dT} = \frac{p + \epsilon}{T}. \tag{16.16}$$

We will apply this relation next in the expanding Universe, for we
shall see that as the Universe expands it cools adiabatically.

We first recall the conservation law satisfied by $\epsilon$ and $p$ in the early
stages of the expanding Universe, the law given by (15.14),

$$\frac{d}{dS}(\epsilon S^3) + 3pS^2 = 0, \tag{16.17}$$

and use it in conjunction with Equation (16.16). A simple exercise in
calculus leads to the conclusion that the entropy in a given volume is
constant:

$$sS^3 = \frac{S^3(p + \epsilon)}{T} = \text{constant}. \tag{16.18}$$

In general, the above expressions become simplified for particles
moving relativistically. In this case, the mean kinetic energy per particle
far exceeds the rest-mass energy of the particle, an inequality expressed
by

$$T \gg \frac{m_Ac^2}{k} \equiv T_A. \tag{16.19}$$

This is called the high-temperature approximation, or the relativistic
limit.
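
The threshold temperature $T_A = m_Ac^2/k$ that separates the relativistic from the non-relativistic regime is easy to evaluate for familiar particles. The sketch below uses rounded rest energies in MeV and a rounded Boltzmann constant in eV per kelvin; the dictionary and function names are illustrative.

```python
K_BOLTZMANN_EV = 8.617e-5    # Boltzmann constant in eV per kelvin (rounded)

# Rounded rest energies m c^2 in MeV
REST_ENERGY_MEV = {"electron": 0.511, "muon": 105.66, "proton": 938.27}

def threshold_temperature(rest_energy_mev):
    """T_A = m c^2 / k in kelvin for a rest energy given in MeV."""
    return rest_energy_mev * 1.0e6 / K_BOLTZMANN_EV

for name, energy in REST_ENERGY_MEV.items():
    print(name, f"{threshold_temperature(energy):.3e} K")
```

The electron comes out near $5.9\times10^{9}$ K, the muon near $1.2\times10^{12}$ K and the proton near $10^{13}$ K, so electrons are still relativistic at temperatures where protons have long ceased to be.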

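The thermodynamic details for the various species of interest are
given in Table 16.1. The numbers are expressed in units of the quantities
for the photon ($g_A = 2$; the symbol for the photon is $\gamma$):

$$N_\gamma = \frac{2.404}{\pi^2}\left(\frac{kT}{c\hbar}\right)^3, \qquad \epsilon_\gamma = \frac{\pi^2}{15}\,\frac{(kT)^4}{\hbar^3c^3} = 3p_\gamma, \qquad s_\gamma = \frac{4\pi^2k}{45}\left(\frac{kT}{c\hbar}\right)^3. \tag{16.20}$$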

Now consider a primordial mixture of bosons and fermions all moving
relativistically. The effective energy-density-temperature relationship
for this mixture will be

$$\epsilon = \frac{1}{2}\,g\,aT^4, \tag{16.21}$$

where the ‘g’ factor is related to the total bosonic internal degrees of
freedom $g_b$ and fermionic degrees of freedom $g_f$ by

$$g = g_b + \frac{7}{8}\,g_f. \tag{16.22}$$

The reason becomes clear when we look at the last-but-one column
of Table 16.1. The fermionic energy densities carry an extra factor 7/8.

Table 16.1. Thermodynamic quantities for various particle species at $T \gg T_A$

Particle species A           Symbol                       $T_A$ (K)                            $g_A$  $N_A/N_\gamma$  $\epsilon_A/\epsilon_\gamma$  $s_A/s_\gamma$
Photon                       $\gamma$                     0                                    2      1      1      1
Electron                     $e^-$                        $5.93\times10^{9}$                   2      3/4    7/8    7/8
Positron                     $e^+$                        $5.93\times10^{9}$                   2      3/4    7/8    7/8
Muon                         $\mu^-$                      $1.22\times10^{12}$                  2      3/4    7/8    7/8
Antimuon                     $\mu^+$                      $1.22\times10^{12}$                  2      3/4    7/8    7/8
Muon and electron neutrinos  $\nu_\mu$, $\nu_e$           0                                    1      3/8    7/16   7/16
and their antineutrinos      $\bar\nu_\mu$, $\bar\nu_e$   0                                    1      3/8    7/16   7/16
Pions                        $\pi^+$                      $1.6\times10^{12}$                   1      1/2    1/2    1/2
                             $\pi^-$                      $1.6\times10^{12}$                   1      1/2    1/2    1/2
                             $\pi^0$                      $1.6\times10^{12}$                   1      1/2    1/2    1/2
Proton                       p                            $10^{13}$                            2      3/4    7/8    7/8
Neutron                      n                            $T_n - T_p \approx 1.5\times10^{10}$  2      3/4    7/8    7/8
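
The factors 3/4 and 7/8 for fermions, and the number 2.404 in the photon number density, all come from dimensionless integrals of the form $\int_0^\infty x^n/(e^x \pm 1)\,dx$ obtained from (16.11) and (16.12) in the relativistic limit $E_A = cP$. The sketch below evaluates them with a plain midpoint rule; the function name, the cut-off at $x = 60$ and the step count are illustrative choices.

```python
import math

def thermal_integral(power, sign, upper=60.0, steps=60000):
    """Midpoint estimate of the integral of x^power / (exp(x) + sign) from 0 to upper.

    sign = -1 gives the Bose-Einstein case, sign = +1 the Fermi-Dirac case.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += h * x**power / (math.exp(x) + sign)
    return total

number_boson = thermal_integral(2, -1)      # enters N for bosons
number_fermion = thermal_integral(2, +1)    # enters N for fermions
energy_boson = thermal_integral(3, -1)      # enters the energy density for bosons
energy_fermion = thermal_integral(3, +1)    # enters the energy density for fermions

print(number_boson, number_fermion / number_boson, energy_fermion / energy_boson)
```

For equal numbers of spin states the fermion-to-boson ratios come out as 3/4 for the number density and 7/8 for the energy density, which is the origin of the weighting of $g_f$ in the ‘g’ factor.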

In this approximation consider the electrical potential energy of any
two electrons separated by distance $r$. This is given by

$$U = \frac{e^2}{r}.$$

Now the average inter-electron distance is given by $N_e^{-1/3} \sim c\hbar/(kT)$.
Thus the average interaction energy is

$$\langle U\rangle \sim \frac{e^2}{\hbar c}\,kT.$$

However, $kT$ measures the energy of motion of electrons. Thus the
interaction energy is $e^2/(\hbar c) \sim 1/137$ of the energy of motion. Since
the fraction is small, we are justified in treating the electrons as a free gas.






In contrast, at low temperatures $T \ll T_A$ we have for all species with
$m_A \neq 0$

$$N_A = g_A\left(\frac{m_AkT}{2\pi\hbar^2}\right)^{3/2}\exp\left(-\frac{T_A}{T}\right), \qquad \epsilon_A = m_Ac^2N_A, \qquad p_A = N_AkT, \qquad s_A = \frac{m_AN_A}{T}\,c^2. \tag{16.23}$$

Notice that with a fall in temperature all these quantities drop off rapidly.
We will often refer to this limit as the non-relativistic approximation. (For
the photon and a zero-rest-mass neutrino $T_A = 0$ and this approximation
never applies.)

When applying these results to cosmology, the following considerations
usually count. First, the expansion of the Universe is controlled
by the species that are in the relativistic limit, for these are
the particles that are present in greater abundance. Heavier species
are reduced in number because of the exponential damping term of
Equation (16.23). Thus, as the temperature of the Universe drops with
expansion, the heavier species progressively diminish in dynamical
importance.


16.2.2 Decoupling of neutrinos

In general, our understanding of the early epochs of the Universe tells us 
that there are two processes going on at any given time: the expansion 
of the Universe with a characteristic rate given by the Hubble constant 
$H(t) = \dot{S}/S$ and some process involving the interaction of its particle
species. If the latter is slower than the former, the process ceases to
have any important role in determining the physical properties of the
Universe. After the Universe has cooled through to a temperature of
$\sim 10^{11}$ K, the first major event to occur because of such a reason is the
decoupling of neutrinos. 

Using the properties of the weak interactions of physics, one can
show that the rate of interaction of neutrinos with leptons (electrons,
positrons, muons, neutrinos etc.) is of the order

$$ \eta = \mathcal{G}^2 \hbar^{-7} c^{-6} (kT)^5 \exp\left(-\frac{T_\mu}{T}\right). \tag{16.24} $$

Here $\mathcal{G}$ is the weak-interaction constant. We must now take note of
the other rate mentioned earlier, that is relevant to the maintenance of
equilibrium of neutrinos - the rate at which a typical volume enclosing
them expands. From Einstein’s equations we get

$$ H^2 = \frac{8\pi G}{3c^2}\,\varepsilon = \frac{16\pi^3 G}{90\,\hbar^3 c^5}\,(kT)^4. \tag{16.25} $$
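$H$, the Hubble constant at the particular epoch, measures the rate of
expansion of the volume in question. Thus the ratio of the reaction rate
$\eta$ of (16.24) to the expansion rate $H$ of (16.25) is given by

$$ \frac{\eta}{H} = \left(\frac{90}{16\pi^3}\right)^{1/2} G^{-1/2}\, \hbar^{-11/2}\, \mathcal{G}^2\, c^{-7/2}\, (kT)^3 \exp\left(-\frac{T_\mu}{T}\right) \tag{16.26} $$

$$ \sim T_{10}^3 \exp\left(-\frac{1.22}{T_{12}}\right). \tag{16.27} $$

Here we have substituted the values of $G$, $\hbar$, $\mathcal{G}$, $c$, $k$ and $T_\mu$ and arrived
at the above numerical expression. Further, we have written the temperatures
using the compact notation that $T_n$ indicates temperature expressed
in units of $10^n$ K, i.e.,

$$ T_n = \frac{T}{10^n\ \mathrm{K}}. $$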

What does (16.27) tell us? As the temperature drops below $10^{12}$ K,
the exponential decreases rapidly. This means that the reactions involving
neutrinos run at a slower rate than the expansion rate of the Universe.
The neutrinos then cease to interact with the rest of the matter and therefore
drop out of thermal equilibrium as temperatures fall appreciably
below $T_{12} = 1$. How far below?

The original theory of weak interactions suggested that this temperature
may be about $T_{11} = 1.3$. In the late 1960s and early 1970s
successful attempts to unify the weak interaction with the electromagnetic
interaction led to additional (neutral-current) reactions that keep
neutrinos interacting with other matter at even lower temperatures. The
outcome of these investigations is that the neutrinos can remain in thermal
equilibrium down to temperatures of the order of $T_{10} = 1$.
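
As a rough numerical check of the decoupling argument, the sketch below evaluates the ratio of the weak reaction rate of (16.24) to the expansion rate of (16.25). The physical constants are standard CGS values that the text does not list, and the value used for the weak-interaction constant (the Fermi constant, about $1.4 \times 10^{-49}$ erg cm$^3$) is an assumption; the function names are illustrative only.

```python
import math

# Standard CGS constants (assumed here; the text does not tabulate them).
G_NEWTON = 6.674e-8    # gravitational constant, cm^3 g^-1 s^-2
HBAR = 1.0546e-27      # reduced Planck constant, erg s
C_LIGHT = 2.9979e10    # speed of light, cm s^-1
K_BOLTZ = 1.3807e-16   # Boltzmann constant, erg K^-1
G_WEAK = 1.435e-49     # assumed weak-interaction (Fermi) constant, erg cm^3
T_MU = 1.22e12         # muon threshold temperature from Table 16.1, K


def reaction_rate(temp_k: float, with_threshold: bool = True) -> float:
    """Order-of-magnitude weak reaction rate of (16.24), in s^-1."""
    kt = K_BOLTZ * temp_k
    rate = G_WEAK**2 * HBAR**-7 * C_LIGHT**-6 * kt**5
    if with_threshold:
        rate *= math.exp(-T_MU / temp_k)
    return rate


def expansion_rate(temp_k: float) -> float:
    """Expansion rate H of (16.25), in s^-1."""
    kt = K_BOLTZ * temp_k
    h_squared = 16 * math.pi**3 * G_NEWTON / (90 * HBAR**3 * C_LIGHT**5) * kt**4
    return math.sqrt(h_squared)


def rate_ratio(temp_k: float, with_threshold: bool = True) -> float:
    """Ratio of the reaction rate to the expansion rate at temp_k (kelvin)."""
    return reaction_rate(temp_k, with_threshold) / expansion_rate(temp_k)


if __name__ == "__main__":
    for temp in (1e12, 3e11, 1e11):
        print(f"T = {temp:.0e} K: rate/H = {rate_ratio(temp):.3g}")
```

With these assumed constants the ratio falls below unity somewhere between $10^{12}$ K and $10^{11}$ K, and the prefactor without the exponential is of order $T_{10}^3$, in line with the qualitative discussion above.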

However, even though neutrinos decouple themselves from the rest
of the matter, their distribution function still retains its original form with
the temperature dropping as $T \propto S^{-1}$. This is because as the Universe
expands the momentum and energy of each neutrino fall as $S^{-1}$ and the
number density of neutrinos falls as $S^{-3}$. Since the temperature of the
rest of the mixture also drops as $S^{-1}$ and since the two temperatures
were equal when the neutrinos were coupled with the rest of the matter,
the two temperatures continue to remain equal even though neutrinos
and the rest of the matter are no longer in interaction with one another.
These remarks about neutrinos are meant to apply to all four species
$\nu_e$, $\bar{\nu}_e$, $\nu_\mu$ and $\bar{\nu}_\mu$.
16.2.3 Electron-positron annihilation

There is, however, another (later) epoch when the neutrino temperature
begins to differ from the temperature of the rest of the matter. First
consider the Universe in the temperature range $T_{11} = 1$ to $T_{10} = 1$.
In this phase we have the neutrinos, the electron-positron pairs and
the photons, each with distribution functions in the high-temperature
approximation (see Table 16.1). Thus, referring back to the formula
(16.21), we get

$$ \varepsilon = \frac{9}{2}\, a T^4. \tag{16.28} $$


Thus in this period the expansion equation is modified from our
simplified formula (16.4) (for photons only) to

$$ \frac{\dot{S}^2}{S^2} = \frac{12\pi G a}{c^2}\, T^4 \tag{16.29} $$

and the relation (16.7) is changed to

$$ T = \left(\frac{c^2}{48\pi G a}\right)^{1/4} t^{-1/2}, \tag{16.30} $$

which we may rewrite as

$$ T_{10} = 1.04\, t^{-1/2}. \tag{16.31} $$


However, in the next phase the situation becomes complicated, 
because, with the cooling of the Universe, the electron-positron pairs 
are no longer relativistic. Thus the high-temperature approximation is 
no longer valid for them. As they slow down, they annihilate each other. 
We will not go into the details of this phase but instead jump across to its 
end, when the pairs have annihilated, leaving only photons (and possibly 
any excess electrons): 

$$ e^- + e^+ \rightarrow \gamma + \gamma. $$


Thus the energy, originally in $e^\pm$ and photons, is now vested only in
photons, raising their number and temperature. How can we evaluate
this change? It is here that Equation (16.18), telling us of the constancy
of $\sigma$, comes to our help.

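In the relativistic phase ($T_9 > 5$) of $e^\pm$ we have

$$ \sigma = \frac{4S^3}{3T}\left(\varepsilon_{e^-} + \varepsilon_{e^+} + \varepsilon_\gamma\right) = \frac{11}{3}\, a (ST)^3. \tag{16.32} $$

When the $e^\pm$ have annihilated and left only photons, we have the
photon temperature $T_\gamma$ given by

$$ \sigma = \frac{4S^3}{3T_\gamma}\, \varepsilon_\gamma = \frac{4}{3}\, a (ST_\gamma)^3. \tag{16.33} $$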
We now use the result that the neutrino temperature always declines as
$S^{-1}$. Let us write it as

$$ T_\nu = \frac{B}{S}, \qquad B = \text{constant}. \tag{16.34} $$

Then (16.32) gives

$$ \sigma = \frac{11}{3}\, a B^3 \left(\frac{T}{T_\nu}\right)^3. \tag{16.35} $$

Similarly (16.33) gives

$$ \sigma = \frac{4}{3}\, a B^3 \left(\frac{T_\gamma}{T_\nu}\right)^3. \tag{16.36} $$

Now, in the pre-annihilation era $T = T_\nu$, so that (16.35) tells us
that $\sigma = (11/3)\, a B^3$. After annihilation $\sigma$ must have the same value, so
we may equate it to the value given by (16.36). Thus we arrive at the
conclusion that the photon temperature at the end of $e^\pm$ annihilation has
risen above the neutrino temperature by the factor

$$ \frac{T_\gamma}{T_\nu} = \left(\frac{11}{4}\right)^{1/3} \approx 1.4. \tag{16.37} $$

So the present-day neutrino temperature is lower than the photon temperature
by the factor $(1.4)^{-1}$. If we take the latter to be ~2.7 K, the
former is ~1.9 K.
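
The entropy bookkeeping behind this factor can be sketched in a few lines. The example below assumes, as in Table 16.1, that each species contributes $g_b + \tfrac{7}{8} g_f$ to the conserved entropy and that the neutrinos, being decoupled, take no share of the annihilation energy; the names are illustrative.

```python
from fractions import Fraction

# Effective entropy weights g_b + (7/8) g_f of the species that share the
# photon temperature (the decoupled neutrinos are not counted).
G_BEFORE = Fraction(2) + Fraction(7, 8) * 4  # photons plus e- and e+ (g = 2 each)
G_AFTER = Fraction(2)                        # photons only


def photon_to_neutrino_temperature_ratio() -> float:
    """T_gamma / T_nu after annihilation, from entropy conservation."""
    return float(G_BEFORE / G_AFTER) ** (1.0 / 3.0)


if __name__ == "__main__":
    ratio = photon_to_neutrino_temperature_ratio()
    print(f"T_gamma / T_nu = {ratio:.3f}")
    print(f"T_nu today for T_gamma = 2.7 K: {2.7 / ratio:.2f} K")
```

Using exact fractions makes it easy to see that the ratio of entropy weights is 11/4 before any cube root is taken.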


16.2.4 The neutron-to-proton number ratio 

We have so far developed a picture of the early Universe that is best 
expressed in the form of a time-temperature table of events, as shown in 
Table 16.2 (see also Figure 16.2). We will now be interested in the last 
entry of Table 16.2. 

In our discussion so far we have not paid much attention to baryons - 
the protons and neutrons that are also present in the mixture. In our 
approximation of setting the chemical potentials to zero we took the 
baryon number to be zero. The validity of the approximation depended 
on the baryon number density being several orders (8 to 10) of magnitude 
smaller than the photon density. Nevertheless, we must now take note 
of the existence of baryons, howsoever small their number density; for 
we need them in order to consider Gamow’s idea of nucleosynthesis in 
the hot Universe. We also emphasize that the baryons at this stage of 
the Universe (when nucleosynthesis could occur) are not playing any 
significant role in determining the expansion of the Universe. 

Fig. 16.2. The time-temperature relationship since the Universe was aged
about $10^{-4}$ s, until it became $\sim 10^{3}$ s old. After the annihilation of
$e^\pm$ pairs (when the Universe was 1-10 s old) the photon temperature went
up above the neutrino temperature by a factor ~1.4. That is why the
t-T curve splits into two parts.

Table 16.2. A time-temperature table of events preceding nucleosynthesis
in the early Universe

Time since big bang (s)    Temperature (K)          Events
$< 10^{-4}$                $> 10^{12}$              Baryons, mesons, leptons and photons in
                                                    thermal equilibrium.
$10^{-4}$-$10^{-2}$        $10^{12}$-$10^{11}$      $\mu^\pm$ begin to annihilate and disappear from the
                                                    mixture. Neutrinos begin to decouple from
                                                    the rest of the matter.
$10^{-2}$-1                $10^{11}$-$10^{10}$      Neutrinos decouple completely,
                                                    $e^\pm$ pairs still relativistic.
1-10                       $10^{10}$-$10^{9}$       The $e^\pm$ pairs annihilate and disappear, raising
                                                    the photon-gas temperature to ~1.4 times
                                                    the temperature of neutrinos.
10-180                     $10^{9}$-$10^{8}$        Nucleosynthesis takes place.

Insofar as chemical potentials are concerned, we will take explicit
note of them in the following section. However, first notice that the critical
temperatures $T_n$ and $T_p$ of Table 16.1 are very high, so the neutron and
proton distribution functions follow the non-relativistic approximations
given by (16.23). Thus we get



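$$ N_p = \frac{2}{\hbar^3}\left(\frac{m_p kT}{2\pi}\right)^{3/2} \exp\left(-\frac{T_p}{T}\right), \qquad N_n = \frac{2}{\hbar^3}\left(\frac{m_n kT}{2\pi}\right)^{3/2} \exp\left(-\frac{T_n}{T}\right). \tag{16.38} $$

In this approximation the neutron-to-proton number ratio is given by

$$ \frac{N_n}{N_p} = \left(\frac{m_n}{m_p}\right)^{3/2} \exp\left(-\frac{T_n - T_p}{T}\right) \approx \exp\left(-\frac{1.5 \times 10^{10}\ \mathrm{K}}{T}\right). \tag{16.39} $$

The ratio therefore drops with temperature, from near 1 : 1 at $T > 10^{12}$
K to about 5 : 6 at $T = 10^{11}$ K, and to 3 : 5 at $3 \times 10^{10}$ K ($m_p \approx m_n$).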
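
The quoted ratios can be checked with a short calculation. The sketch below assumes that the equilibrium ratio takes the Boltzmann form $\exp[-(T_n - T_p)/T]$ with $T_n - T_p \approx 1.5 \times 10^{10}$ K from Table 16.1 and with the small mass-ratio prefactor neglected; the function name is illustrative.

```python
import math

DELTA_T = 1.5e10  # T_n - T_p in kelvin, from Table 16.1


def neutron_to_proton_ratio(temp_k: float) -> float:
    """Equilibrium N_n / N_p, taking m_n ~ m_p so that the prefactor is 1."""
    return math.exp(-DELTA_T / temp_k)


if __name__ == "__main__":
    for temp in (1e12, 1e11, 3e10, 1e10):
        print(f"T = {temp:.0e} K: N_n/N_p = {neutron_to_proton_ratio(temp):.3f}")
```

At $10^{11}$ K this gives about 0.86, close to 5 : 6, and at $3 \times 10^{10}$ K about 0.61, close to 3 : 5.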

The thermodynamic equilibrium of these ‘heavy’ particles is maintained
so long as their weak interaction with the light particles is significantly
fast (compared with $H(t)$). Detailed calculations show that the
cross section of this interaction goes as $T$ and the effective decoupling
temperature $T_*$ at which the reaction rate is just about equal to $H$ is
$< 10^{10}$ K. Note that, if the Universe were expanding faster, $T_*$ would be
higher and the ratio $N_n/N_p$ at decoupling as given by (16.39) would be
higher. We will recall this point when relating the helium abundance to
the number of neutrino species.

Once the thermodynamic equilibrium ceases to be maintained, the
$N_n/N_p$ ratio is given not by (16.39) but by detailed consideration of
specific reactions involving the nucleons.

Thus the ratio of neutrons to protons is uniquely determined at the 
time nucleosynthesis begins, once we know all the parameters of the 
weak interaction. This is one good aspect of primordial nucleosynthesis 
theory, which was first pointed out by Chushiro Hayashi in 1950 [60]. 
We now proceed to discuss its outcome. 


16.3 The formation of light nuclei 

The process of nucleosynthesis may be considered as a battle between 
high-speed nuclei flying about in all directions and the strong nuclear 
force of attraction trying to trap them and bind them together. Clearly, 
at the higher temperatures the former win, whereas at lower temperatures
the latter prevails. This can be seen quantitatively in the following
way. 

A typical nucleus Q is described by two quantities, $A$, the atomic
mass, and $Z$, the atomic number, and is written $^A_Z\mathrm{Q}$. This nucleus has
$Z$ protons and $(A - Z)$ neutrons. If $m_Q$ is the mass of the nucleus, its
binding energy is given by

$$ B_Q = [Z m_p + (A - Z) m_n - m_Q]\, c^2. \tag{16.40} $$


Let us now consider a unit volume of cosmological medium containing
$N_N$ nucleons, bound or free. Since the masses of protons and
neutrons are nearly equal, we may denote the typical nucleon mass by
$m$. Thus $m_n \approx m_p = m$. If there are $N_n$ free neutrons and $N_p$ free protons
in the mixture, the ratios

$$ X_n = \frac{N_n}{N_N}, \qquad X_p = \frac{N_p}{N_N} \tag{16.41} $$

will denote the fractions by mass of free neutrons and free protons. If a
typical bound nucleus Q has atomic mass $A$ and there are $N_Q$ of them
in our unit volume, we may similarly denote the mass fraction of Q by

$$ X_Q = \frac{N_Q A}{N_N}. \tag{16.42} $$


Now, at very high temperatures ($T \gg 10^{10}$ K), the nuclei are
expected to be in thermal equilibrium. However, because of their relatively
large masses, even at these temperatures $T \ll T_Q$ and the non-relativistic
approximation holds. Further, since we are now concerned
with relative number densities, we can no longer ignore the chemical
potentials. Thus we have

$$ N_Q = g_Q \left(\frac{m_Q kT}{2\pi\hbar^2}\right)^{3/2} \exp\left(\frac{\mu_Q - m_Q c^2}{kT}\right), \tag{16.43} $$

where we have reinstated the chemical potentials $\mu_Q$. Since chemical
potentials are conserved in nuclear reactions,

$$ \mu_Q = Z\mu_p + (A - Z)\mu_n, \tag{16.44} $$

assuming that the nuclei were built out of neutrons and protons by
nuclear reactions.

Using Equation (16.44), the unknown chemical potentials can be
eliminated between (16.43) and similar relations for $N_p$ and $N_n$. The
result is expressed in this form:

$$ X_Q = \frac{1}{2}\, g_Q A^{5/2} X_p^{Z} X_n^{A-Z}\, \xi^{A-1} \exp\left(\frac{B_Q}{kT}\right), \tag{16.45} $$

where

$$ \xi = \frac{1}{2}\, N_N \left(\frac{mkT}{2\pi\hbar^2}\right)^{-3/2}. \tag{16.46} $$


For an appreciable build-up of complex nuclei, $T$ must drop to a low
enough value to make $\exp[B_Q/(kT)]$ large enough to compensate for
the smallness of $\xi^{A-1}$. This happens for nucleus Q when $T$ has dropped
down to

$$ T_Q \sim \frac{B_Q}{k(A - 1)\,|\ln \xi|}. \tag{16.47} $$


Let us consider what happens when we apply the above formula to
the nucleus of $^4$He. This nucleus is made by fusion of four nucleons, i.e.,
it requires a four-body encounter. The binding energy of this nucleus is
given approximately by $4.3 \times 10^{-5}$ erg. If we substitute this value into
(16.47) and estimate $N_N$ from the currently observed value of nucleon
density of about $10^{-6}$ cm$^{-3}$, we find that $T_Q$ is as low as $\sim 3 \times 10^{9}$ K.
However, at this low temperature the number densities of participating
nucleons are so low that four-body encounters leading to the formation
of $^4$He are extremely rare. Thus we need to proceed in a less ambitious
fashion in order to describe the build-up of complex nuclei.
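
Because the quantity $\xi$ itself depends on temperature, the estimate from (16.47) has to be made self-consistently. The sketch below does this by simple fixed-point iteration. It assumes that the nucleon density scales as $T^3$ from its present value, that the present radiation temperature is 2.7 K, and that $\xi = \tfrac{1}{2} N_N (mkT/2\pi\hbar^2)^{-3/2}$; the constants are standard CGS values and the function names are illustrative.

```python
import math

K_BOLTZ = 1.380649e-16   # Boltzmann constant, erg K^-1
HBAR = 1.054572e-27      # reduced Planck constant, erg s
M_NUCLEON = 1.6726e-24   # typical nucleon mass (proton mass), g
N_TODAY = 1.0e-6         # present nucleon density quoted in the text, cm^-3
T_TODAY = 2.7            # assumed present radiation temperature, K


def xi(temp_k: float) -> float:
    """Dimensionless parameter xi at temperature temp_k (kelvin)."""
    # Assumption: nucleon number is conserved, so N_N scales as T^3.
    n_nucleon = N_TODAY * (temp_k / T_TODAY) ** 3
    thermal = (M_NUCLEON * K_BOLTZ * temp_k / (2 * math.pi * HBAR**2)) ** 1.5
    return 0.5 * n_nucleon / thermal


def buildup_temperature(binding_erg: float, mass_number: int,
                        guess_k: float = 1e9, iterations: int = 50) -> float:
    """Iterate T = B / (k (A - 1) |ln xi(T)|) until it settles."""
    temp = guess_k
    for _ in range(iterations):
        log_xi = abs(math.log(xi(temp)))
        temp = binding_erg / (K_BOLTZ * (mass_number - 1) * log_xi)
    return temp


if __name__ == "__main__":
    t_he4 = buildup_temperature(binding_erg=4.3e-5, mass_number=4)
    print(f"Build-up temperature for helium-4: {t_he4:.2e} K")
```

The iteration converges quickly because the temperature enters only through a logarithm, and under these assumptions the result is close to the $\sim 3 \times 10^{9}$ K quoted above.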

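Hence we try using two-body collisions (which are not so rare) to
describe the build-up of heavier nuclei. Thus deuterium ($^2$H), tritium
($^3$H) and helium ($^3$He, $^4$He) are built up in a sequence via reactions like

$$ \begin{aligned} p + n &\leftrightarrow {}^2\mathrm{H} + \gamma, \\ {}^2\mathrm{H} + {}^2\mathrm{H} &\leftrightarrow {}^3\mathrm{He} + n \leftrightarrow {}^3\mathrm{H} + p, \\ {}^3\mathrm{H} + {}^2\mathrm{H} &\leftrightarrow {}^4\mathrm{He} + n. \end{aligned} \tag{16.48} $$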


Since formation of deuterium involves only two-body collisions, it
quickly reaches its equilibrium abundance, which follows from (16.45)
with $A = 2$, $Z = 1$ and $g_d = 3$:

$$ X_d = 6\sqrt{2}\, X_p X_n\, \xi \exp\left(\frac{B_d}{kT}\right). \tag{16.49} $$

However, the binding energy $B_d$ of deuterium is low, so, unless $T$ drops
to less than $10^9$ K, $X_d$ is not high enough to start further reactions
leading to $^3$H, $^3$He and $^4$He. In fact the reactions given in (16.48), with
the exception of the first one, do not proceed fast enough until the
temperature has dropped to $\sim 8 \times 10^8$ K.

Although at such temperatures nucleosynthesis does proceed rapidly 
enough, it cannot go beyond $^4$He. This is because there are no stable
nuclei with $A = 5$ or 8, and nuclei heavier than $^4$He break up as soon
as they are made. Their primordial abundances are extremely small. So 
the process effectively terminates there. Detailed calculations by several 
authors have now established this result quite firmly. 

So, starting with primordial neutrons and protons, we end up finally
with $^4$He nuclei and free protons. All neutrons have been gobbled up
by helium nuclei. Thus, if we consider the fraction by mass of primordial
helium, it is very simply related to the quantity $X_n$ - the neutron
concentration before nucleosynthesis began. Denoting the helium fraction
by mass by the symbol $Y$, we get

$$ Y = 2X_n. \tag{16.50} $$

In Figure 16.3 the cosmic mass fractions of $^4$He, $^3$He, $^2$H and other light
nuclei are plotted against a parameter $\eta$ defined by

$$ \eta = \left(\frac{\rho_0}{1.97 \times 10^{-26}\ \mathrm{g\,cm^{-3}}}\right) \left(\frac{2.7}{T_0}\right)^3. \tag{16.51} $$

Thus $\eta$ essentially measures the nucleon density in the early Universe
through the formula

$$ \rho = \eta\, T_9^3\ \mathrm{g\,cm^{-3}}, \qquad T_9 < 3. \tag{16.52} $$

We will shortly discuss the implications of this parameter for primordial
production of light nuclei.


Fig. 16.3. Mass fractions of light nuclei produced in the early Universe
for various values of the parameter $\eta$ related to baryon density. The
production of deuterium drops steeply as $\eta$ approaches a value in excess
of $10^{-4}$. [Horizontal axis: $\rho_0 (3/T_0)^3$ in g cm$^{-3}$.]
16.3.1 Helium abundance and the number of neutrino species

Note that the $^4$He mass fraction is insensitive to the parameter $\eta$. This
is because, as we saw just now, it depended only on $X_n$, which in turn
depends more critically on the epoch when the rate of weak interactions
fell below the expansion rate. If we go back to (16.39), we see that in
the very early stages the neutron-to-proton ratio was determined by the
decoupling temperature $T_*$. A faster expansion rate implies that the ratio
became frozen at a higher temperature and so was higher, thus leading
to a higher $^4$He abundance.

To see the effect quantitatively, recall from (16.39) that there was
a ‘last epoch’ of temperature $T_*$ when the neutron-to-proton ratio was
determined from considerations of thermodynamic equilibrium:

$$ \frac{N_n}{N_p} = \exp\left(-\frac{T_n - T_p}{T_*}\right). \tag{16.53} $$

The temperature $T_*$ was determined by equating the Hubble constant $H$
to the reaction rate $\eta$ for $n \leftrightarrow p$ conversions. Now

$$ H \propto g^{1/2} T^2 \quad \text{and} \quad \eta \propto T^4, $$

so that

$$ T_*^2 \propto g^{1/2}. \tag{16.54} $$


Example 16.3.1 Problem. How does $T_*$ depend on the number of neutrino
species?


Solution. First we note that $T_*$ is obtained by equating $H$ with $\eta$. Given the
relations above, we have

$$ T_* = \beta g^{1/4}, $$

where $\beta$ is a known constant.

Now suppose that there are $r$ neutrino species, $r \geq 3$. Taking the primordial
brew to contain photons ($\gamma$) and electrons (e) and neutrinos as other
relativistic particles, we have $g_b = 2$ and $g_f = 2 + 2r$, where we also include
antineutrinos. Thus

$$ g = 2 + \frac{7}{8}(2r + 2) = \frac{15}{4} + \frac{7r}{4}. $$

Since the neutron-to-proton ratio is given by

$$ x = \exp\left(-\frac{T_n - T_p}{T_*}\right) = \exp\left(-\frac{\sigma}{T_*}\right), $$

say, with $\sigma = 1.5 \times 10^{10}$, the above relations lead to

$$ x = \exp\left[-\frac{\sigma}{\beta}\left(\frac{7r + 15}{4}\right)^{-1/4}\right]. $$

For $r = 3$ we have $x = 1/7$, corresponding to $Y = 1/4$. Numerical calculations
for $r = 4$ and $r = 5$ lead to $Y = 0.27$ and 0.29, respectively. Since
most estimates of primordial helium put $Y < 0.25$, such higher values are
ruled out.
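
The numbers in this example can be reproduced with a few lines of code. The sketch below assumes the scaling $T_* \propto g^{1/4}$ with $g = 2 + \tfrac{7}{8}(2r + 2)$, fixes the unknown constant by requiring $x = 1/7$ for $r = 3$, and converts $x$ to a helium mass fraction through $Y = 2X_n$ with $X_n = x/(1 + x)$. The function names are illustrative.

```python
import math


def effective_g(r: int) -> float:
    """g = g_b + (7/8) g_f with g_b = 2 and g_f = 2 + 2r, as in the example."""
    return 2 + 7 / 8 * (2 * r + 2)


# sigma / beta is fixed by the calibration x = 1/7 for r = 3.
SIGMA_OVER_BETA = math.log(7.0) * effective_g(3) ** 0.25


def neutron_proton_ratio(r: int) -> float:
    """Frozen-in ratio x = exp[-(sigma/beta) g^(-1/4)] for r neutrino species."""
    return math.exp(-SIGMA_OVER_BETA * effective_g(r) ** -0.25)


def helium_mass_fraction(r: int) -> float:
    """Y = 2 X_n with X_n = x / (1 + x), all neutrons ending up in helium."""
    x = neutron_proton_ratio(r)
    return 2 * x / (1 + x)


if __name__ == "__main__":
    for r in (3, 4, 5):
        print(f"r = {r}: x = {neutron_proton_ratio(r):.4f}, "
              f"Y = {helium_mass_fraction(r):.3f}")
```

Under these assumptions the helium fraction rises from 0.25 for three species to about 0.27 and 0.29 for four and five species, matching the values quoted in the example.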


The calculation shown above tells us that an increase in the number 
of neutrino species would result in an increase of Y. 

This result is relevant to the question of how many different types
of neutrinos exist primordially. The formalisms used by particle physicists
allow for three or more neutrino types, $\nu_e$, $\nu_\mu$ and $\nu_\tau$. Having more
types of neutrinos existing forces the value of $Y$ upwards. When we look
at observations, we discover that the present estimates of helium abundance,
$Y < 0.25$, rule out the existence of more than three neutrino types.

It is also interesting that the results from particle-accelerator experiments
appear to lead to the same conclusion. A series of experiments
carried out in 1990 with the large electron-positron collider (LEP) at
CERN produced the intermediate $Z^0$ boson [61] in large numbers. The
presence of these particles (which mediate in electro-weak interactions)
could be inferred by detecting resonance peaks in the energy-dependent
cross sections for producing hadrons and leptons. The width of the peak
measures the lifetime of the $Z^0$ boson, and this in turn can be linked
to the number of neutrino species present. The estimate is very close
to 3, which is consistent with the above cosmological considerations.
This circumstance is considered a notable success of the enterprise of
bringing together cosmology and particle physics.


16.3.2 Deuterium abundance and non-baryonic dark matter

In contrast to the behaviour of $Y$, which does not sensitively depend on
the parameter $\eta$, the abundances of other light nuclei do depend on $\eta$.
These abundances are very small compared with $Y$. The most interesting
situation exists for deuterium, whose abundance sharply drops as $\eta$ rises
above $10^{-4}$ (see Figure 16.3). The present estimate of the deuterium
mass fraction is $\sim 2 \times 10^{-5}$. From Figure 16.3, we need $\eta \sim 2 \times 10^{-5}$
to understand the deuterium abundance. For $T_0 = 2.7$ K, this value of $\eta$
corresponds to a present nucleonic density of

$$ \rho_0 \approx 4 \times 10^{-31}\ \mathrm{g\,cm^{-3}}. \tag{16.55} $$
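
The conversion from the density parameter to the present nucleon density is a one-line calculation. The sketch below assumes that the defining relation (16.51) reads $\eta = \rho_0 (2.7/T_0)^3 / (1.97 \times 10^{-26}$ g cm$^{-3})$ and that the required value is $\eta \approx 2 \times 10^{-5}$; the function name is illustrative.

```python
RHO_UNIT = 1.97e-26  # g cm^-3, normalisation assumed to appear in (16.51)


def present_nucleon_density(eta: float, t0_kelvin: float = 2.7) -> float:
    """Invert the definition of eta: rho_0 = eta * 1.97e-26 * (T_0 / 2.7)^3."""
    return eta * RHO_UNIT * (t0_kelvin / 2.7) ** 3


if __name__ == "__main__":
    rho_0 = present_nucleon_density(2e-5)
    print(f"rho_0 = {rho_0:.2e} g cm^-3")
```

For $T_0 = 2.7$ K this gives a density of about $4 \times 10^{-31}$ g cm$^{-3}$, which is the order of magnitude quoted in (16.55).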
On comparing this with (15.32) and (15.40), we see that $\Omega_0 < 0.02$
and hence $q_0 < 0.01$. Therefore, if even such a small amount of deuterium
believed to be primordial in origin were found, Friedmann models
of the closed variety would be ruled out. There is, however, a loophole
in this argument: we can still accommodate non-baryonic matter in the
Universe. Such matter does not affect the deuterium abundance, but
contributes to $\Omega_0$. Matter of this kind will have to be dark.

To summarize, the process of primordial nucleosynthesis delivers
the right abundance of helium, if the parameter $\eta$ is properly adjusted,
and of deuterium, if one allows for a substantial presence of non-baryonic
dark matter in the Universe. Apart from marginal production of other
light nuclei up to lithium, the primordial process fails for the production
of all other nuclei. For these one has to invoke stellar nucleosynthesis, 
which does produce the right amounts of these remaining (over 200) 
isotopes [62]. One may be tempted to invoke Occam’s razor, and argue 
that all isotopes were produced inside stars. Such an attempt, however, 
has not succeeded so far: one gets an inadequate quantity of helium in 
this way, and no deuterium at all. Interestingly, these two failure points of 
stellar nucleosynthesis are precisely those where the primordial process 
succeeds. 


16.4 The microwave background 

The era of nucleosynthesis took place when the temperature was about
$10^9$ K. The Universe in subsequent phases continued to cool as it
expanded, with the radiation temperature dropping as $S^{-1}$. The presence
of nuclei, free protons and electrons did not have much effect on
the dynamics of the Universe, which was still radiation-dominated. However,
these particles, especially the lightest of them, the electrons, acted
as scattering centres for the ambient radiation and kept it thermalized.
The Universe was therefore quite opaque to start with.

However, as the Universe cooled, the electron-proton electrical
attraction began to assert itself. In detailed calculations performed by
P. J. E. Peebles [63], the mixture of electrons and protons and of hydrogen
atoms was studied at varying temperatures. Because of Coulomb
attraction between the electron and the proton, the hydrogen atom has
a certain binding energy $B$. The problem of determining the relative
number densities of free electrons, free protons (that is, ions) and neutral
H atoms in thermal equilibrium is therefore analogous to that we
considered earlier in deriving (16.45) for the mixture of free and bound
nucleons. The only difference is that the binding to be considered now is
electrostatic rather than nuclear. Following the same method, we arrive
at the formula relating the number densities of electrons ($N_e$), protons
($N_p = N_e$), and H atoms ($N_H$) at a given temperature $T$:

$$ \frac{N_e^2}{N_H} = \left(\frac{m_e kT}{2\pi\hbar^2}\right)^{3/2} \exp\left(-\frac{B}{kT}\right), \tag{16.56} $$

where $m_e$ is the electron mass. This equation is a particular case of
Saha's ionization equation. In about 1920 Meghnad Saha had looked at
the problem of ionization in the context of stellar atmospheres and had
derived this equation [64].

Writing $N_B$ for the total baryon number density, we may express the
fraction of ionization by the ratio

$$ x = \frac{N_e}{N_B}. \tag{16.57} $$

Then, since $N_H = N_B - N_e$, we get from (16.56)

$$ \frac{x^2}{1 - x} = \frac{1}{N_B}\left(\frac{m_e kT}{2\pi\hbar^2}\right)^{3/2} \exp\left(-\frac{B}{kT}\right). \tag{16.58} $$

For the H atom, $B = 13.59$ eV. By substituting for various quantities on
the right-hand side of (16.58), we can solve for $x$ as a function of $T$. The
results show that $x$ drops sharply from 1 to near zero in the temperature
range of ~5000 K to 2500 K, depending on the value of $N_B$, that is, on
the parameter $\Omega_0 h_0^2$. For example, for $\Omega_0 h_0^2 = 0.01$, $x = 0.003$ at $T =$
3000 K.
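
The sharp drop in ionization can be seen by solving the Saha relation numerically. The sketch below treats $x^2/(1 - x) = R$ as a quadratic in $x$, with $R$ the right-hand side of the ionization equation. The baryon density is an assumption: it is scaled back as $T^3$ from a present value of $\Omega_0 h_0^2 \times 1.88 \times 10^{-29}$ g cm$^{-3}$ divided by the proton mass, with a present temperature of 2.7 K. All names and constants are illustrative choices, not taken from the text.

```python
import math

K_BOLTZ = 1.380649e-16        # Boltzmann constant, erg K^-1
HBAR = 1.054572e-27           # reduced Planck constant, erg s
M_ELECTRON = 9.10938e-28      # electron mass, g
M_PROTON = 1.6726e-24         # proton mass, g
EV_IN_ERG = 1.602177e-12      # erg per eV
BINDING = 13.59 * EV_IN_ERG   # hydrogen binding energy B, erg
RHO_CRIT_H2 = 1.88e-29        # assumed critical density / h^2, g cm^-3
T_TODAY = 2.7                 # assumed present radiation temperature, K


def baryon_density(temp_k: float, omega_h2: float) -> float:
    """Baryon number density (cm^-3), scaled back as T^3 from today."""
    n_today = omega_h2 * RHO_CRIT_H2 / M_PROTON
    return n_today * (temp_k / T_TODAY) ** 3


def ionization_fraction(temp_k: float, omega_h2: float = 0.01) -> float:
    """Solve x^2 / (1 - x) = R for the ionization fraction x."""
    kt = K_BOLTZ * temp_k
    thermal = (M_ELECTRON * kt / (2 * math.pi * HBAR**2)) ** 1.5
    rhs = thermal * math.exp(-BINDING / kt) / baryon_density(temp_k, omega_h2)
    if rhs == 0.0:
        return 0.0
    # Positive root of x^2 + R x - R = 0, written to avoid cancellation.
    return 2.0 / (1.0 + math.sqrt(1.0 + 4.0 / rhs))


if __name__ == "__main__":
    for temp in (5000, 4000, 3500, 3000, 2500):
        print(f"T = {temp} K: x = {ionization_fraction(temp):.2e}")
```

The exact value of $x$ at a given temperature depends on the adopted baryon density and present temperature, so this sketch need not reproduce the figure quoted in the text, but it does show $x$ falling from nearly 1 at 5000 K to well below $10^{-3}$ at 2500 K.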

Thus by this time most of the free electrons have been removed 
from the cosmological brew, and as a result the main agent responsible 
for the scattering of radiation disappears from the scene. The Universe 
becomes effectively transparent to radiation. This epoch is often called 
the recombination epoch, although the word ‘recombination’ is inappropriate
since the electrons and protons are combining for the first time at
this epoch. It is more appropriate to call it the epoch of last scattering. 

The transparency of the Universe means that a light photon can go
a long way ($\sim c/H$) without being absorbed or scattered. Therefore this
epoch signifies the beginning of the new phase when matter and radiation
became decoupled. This phase has lasted up to the present epoch. During
this phase, the frequency of each photon is redshifted according to the
rule

$$ \nu \propto \frac{1}{S}, \tag{16.59} $$

while the number density of photons has fallen as

$$ N_\gamma \propto \frac{1}{S^3}. \tag{16.60} $$

It is easy to see that under these conditions the photon distribution
function preserves the Planckian form with the temperature dropping as

$$ T \propto \frac{1}{S}. \tag{16.61} $$

Does such a Planckian background exist in the Universe today? In 
1942 A. McKellar reported that the populated upper levels of the CN 
molecule in interstellar space led to the conclusion that there was a radi¬ 
ation background of ~2.3 K. This result [65] came during the Second 
World War when the normal channels of communication between scientists
were closed. The result therefore went largely unnoticed. In 1948,
Ralph Alpher and Robert Herman, junior colleagues of George Gamow, 
made the prediction that, since the hot Universe had cooled down, a 
blackbody radiation background of temperature about 5 K should exist 
now [66]. They made a guess of the present temperature, since it was not 
possible to tie down the radiation temperature to the present epoch from 
the physics of the early Universe. Since cosmology was considered a 
highly speculative field by the physicists, this important prediction was 
largely ignored. 

This radiation background was subsequently found by Arno Penzias 
and Robert Wilson, more or less serendipitously [67]. They had planned 
using a 20-foot horn-shaped reflector antenna to study radiation in the 
microwave range in the Milky Way. While testing the antenna, they 
pointed it in various directions and used the wavelength 7.35 cm because 
it did not attract much Galactic noise. These test measurements contained 
an unaccounted-for component that was isotropic, i.e., one that could 
not be ascribed to any specific Galactic or extragalactic source. It was 
only when they compared notes with the Princeton group that they 
could identify this radiation with the relic background. For, by 1964, 
Jim Peebles and Robert Dicke at Princeton had walked along the trodden 
path to arrive at the same conclusion as Alpher and Herman. To measure 
the predicted radiation background, Dicke was in fact building a suitable 
antenna in collaboration with his colleague David Wilkinson. 

The Penzias-Wilson measurement at one wavelength, if interpreted 
as blackbody radiation, gave the temperature 3.5 K. In Figure 16.4 
we show the spectrum of the radiation as measured by the Cosmic 
Background Explorer (COBE) satellite in 1990, with a temperature of 
2.735 ± 0.06 K [68]. Besides, the background is extremely homogeneous
and isotropic, far more so than the observed distribution of matter
in the Universe. The blackbody nature of the intensity-frequency curve 
has gone a long way towards confirming in most cosmologists’ minds 
the validity of the early hot-Universe picture discussed by Gamow. 
Fig. 16.4. The precise Planckian spectrum for the microwave background
obtained by the COBE satellite. The curve shown in the figure corresponds
to that of a black body of temperature 2.735 ± 0.06 K.

In the next chapter we will look at some of the observations in
cosmology and return to this topic: for the microwave background is
currently the most important evidence in favour of the hot-big-bang origin
of the Universe. One aspect of the radiation that is still not explained
is its present temperature of 2.7 K. This is sometimes stated in the form
of the observed ratio of photons to baryons:

$$ \frac{N_\gamma}{N_B} = 3.33 \times 10^7\, (\Omega_0 h_0^2)^{-1} \left(\frac{T_0}{2.7}\right)^3. \tag{16.62} $$

This ratio has been conserved since the time at which the Universe 
became essentially transparent, although both $N_\gamma$ and $N_B$ can be studied
theoretically at even earlier epochs. Why the above ratio and no other? 
Many physicists feel that deeper ideas from particle physics are needed 
to throw light on this mystery. 

There are other fundamental issues to tackle too. Some are stated 
below. 


1. Did the big bang really happen? 

2. Why does the Universe exhibit an apparent excess of matter over antimatter? 

3. Prior to the neutrons, protons, etc. assumed to be present at primordial 
nucleosynthesis, what existed in the Universe? 

4. How did the large-scale structure of galaxies and clusters evolve from tiny 
seeds? 
There are other issues linked with inflation, dark matter, dark energy,
etc. All these issues tempted the cosmologist to push his studies closer 
and closer to the big-bang epoch in order to try to understand what went 
on in those early moments. We will briefly describe this approach in the 
remaining part of this chapter. For details the reader is referred to the 
companion volume on cosmology [53]. 


16.5 The time-temperature relation 

We will continue our discussions of the early Universe by going back 
to the time-temperature relationship. We assume that our primordial 
mixture contained both fermions and bosons with total effective degrees 
of freedom $g_f$ and $g_b$, respectively. Then we get the result that relates
the temperature of the Universe to its expansion rate as given by the
Einstein equation,

$$ \frac{\dot{S}^2}{S^2} = \frac{8\pi G}{3}\, \rho. \tag{16.63} $$

If there are bosons with a total $g_b$ of g-factors and fermions with a total
$g_f$ of g-factors, then the above equation has the solution

$$ \rho c^2 = \frac{1}{2}\, g a T^4 \tag{16.64} $$

with

$$ g = g_b + \frac{7}{8}\, g_f. \tag{16.65} $$

Thus we have for $g$ = constant

$$ S \propto t^{1/2} \tag{16.66} $$

with

$$ t = \left(\frac{3c^2}{16\pi G a}\right)^{1/2} g^{-1/2}\, T^{-2}. \tag{16.67} $$

Here $t$ is the time since the big bang. This relation can be expressed as

$$ t_{\mathrm{second}} = 2.4\, g^{-1/2}\, T_{\mathrm{MeV}}^{-2} = 2.4 \times 10^{-6}\, g^{-1/2}\, T_{\mathrm{GeV}}^{-2}, \tag{16.68} $$

where we have used suitable conversion factors to write the temperature
in MeV/GeV units.

This equation gives us at a glance the average particle energy at any 
given time - the earlier the epoch, the higher the energy. In short, by 
going closer and closer to the big-bang epoch we get ever higher particle 
energies. 
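
The numerical coefficient in the time-temperature relation can be checked directly. The sketch below evaluates $t = (3c^2/16\pi G a)^{1/2} g^{-1/2} T^{-2}$ in CGS units and converts the temperature from MeV using $1\ \mathrm{MeV} \approx 1.16 \times 10^{10}$ K. The constants are standard values not listed in the text, and the function names are illustrative.

```python
import math

G_NEWTON = 6.674e-8          # gravitational constant, cm^3 g^-1 s^-2
C_LIGHT = 2.99792458e10      # speed of light, cm s^-1
A_RAD = 7.5657e-15           # radiation constant a, erg cm^-3 K^-4
KELVIN_PER_MEV = 1.16045e10  # temperature equivalent of 1 MeV, K


def age_seconds(temp_k: float, g: float) -> float:
    """Time since the big bang for temperature temp_k (K) and g-factor g."""
    prefactor = math.sqrt(3 * C_LIGHT**2 / (16 * math.pi * G_NEWTON * A_RAD))
    return prefactor * g**-0.5 * temp_k**-2


def age_seconds_from_mev(temp_mev: float, g: float) -> float:
    """Same relation with the temperature expressed in MeV."""
    return age_seconds(temp_mev * KELVIN_PER_MEV, g)


if __name__ == "__main__":
    print(f"Coefficient for T in MeV: {age_seconds_from_mev(1.0, 1.0):.2f} s")
    # Grand-unification energy of 1e15 GeV = 1e18 MeV with g = 100.
    print(f"t at 1e15 GeV, g = 100: {age_seconds_from_mev(1e18, 100.0):.2e} s")
```

With these constants the coefficient comes out close to 2.4 s for $T$ in MeV, and a temperature of $10^{15}$ GeV with $g = 100$ corresponds to a time of order $10^{-37}$ s.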

This circumstance has suggested to particle physicists that
collaboration with big-bang cosmologists will be a good venture.
For in particle physics there is a general expectation that at sufficiently
high energies all basic physical interactions will be unified under the
banner of one master reaction. The unification of the electromagnetic
with the weak interaction in the late 1970s showed that the energies
required for unification were of the order of 100 GeV. The next step of
‘grand unification’ of the electroweak theory with the strong interaction
appears to require particle energies as high as $10^{15}$ GeV. Now the particle
accelerators at CERN and Fermilab do not go beyond particle energies of
the order of 1000 GeV. So it is not really possible to check experimentally
the claims of these grand unified theories (GUTs). However, if these
theories are considered to apply to the very early Universe, then we can
identify epochs when that happened. For example, setting $T_{\mathrm{GeV}} = 10^{15}$
in Equation (16.68) gives, for $g = 100$, $t = 2.4 \times 10^{-37}$ s. It is arguable
whether any physical meaning can be attached to such a short time
scale. But if we do not worry about such operational issues, then in the
very early Universe we do have a natural particle accelerator capable of
reaching the GUT energies. From the realization that there is much to
be gained both by cosmologists and by high-energy particle physicists
from collaboration, the subject of astroparticle physics was created.


16.6 Some conceptual problems 

Going from the present-day cosmological scales to those prevailing in 
the very early Universe raises some conceptual difficulties, which we 
highlight first. 


16.6.1 The horizon problem

Let us suppose that the initial conditions for the Universe were set fairly
early on, at an epoch $t$ in the radiation-dominated phase. From the
considerations of Chapter 15 adapted to the scale factor $S \propto t^{1/2}$, we
find that the proper radius of the particle horizon at that epoch was

$$R_P = 2ct. \tag{16.69}$$

Whatever physical processes operated at this epoch were limited in
range by $R_P$. Hence we do not expect the homogeneity of physical
quantities to extend beyond the diameter $2R_P$, unless we make the somewhat
contrived assumption that the Universe was created homogeneous. In
other words, the causal limitations tell us that no region larger than $2R_P$
in size should be homogeneous.

When the initial conditions were so set, the Universe would expand
from them to a much larger size at the present epoch, the factor $\eta$ by
which it would grow being the ratio of scale factors

$$\eta = \frac{S(t_0)}{S(t)}$$

at the present and initial epochs. How do we estimate $\eta$?

The simplest method is to compare the temperatures at $t$ and $t_0$,
since $S \propto T^{-1}$. Thus

$$\eta = \frac{T(t)}{T(t_0)},$$

where $T(t)$ is given by (16.67). It is convenient to express $T_0$ also in
GeV:

$$T_0\,({\rm GeV}) = 2.3\times10^{-13}\left(\frac{T_0}{2.7}\right). \tag{16.70}$$

On combining (16.67) and (16.70) we get the present limit on a
homogeneous region as

$$R_{\rm Hom}(t_0) = 6.2\times10^{17}\times T_{\rm GeV}^{-1}\,g^{-1/2}\times\left(\frac{2.7}{T_0}\right)\ {\rm cm}. \tag{16.71}$$

For $T_{\rm GeV} = 10^{15}$, $g = 100$ and $T_0 = 2.7$ K we get the surprisingly small
value of 62 cm! In other words, we have no reason to expect homogeneity
on a scale larger than, say, 1 m. The fact that the relic microwave
background is homogeneous on the cosmological scale of $\sim10^{28}$ cm tells us
that there is something seriously wrong with our reasoning above. Yet,
the standard model does not provide any loophole out of this so-called
horizon problem. Notice also that the further we go back in the past (in
our attempts to set the initial conditions) the larger will $T_{\rm GeV}$ be and the
smaller will the value of $R_{\rm Hom}(t_0)$ be. Figure 16.5 illustrates the horizon
problem.
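The 62 cm figure can be reproduced with a few lines of arithmetic. The sketch below assumes that the homogeneous scale is the horizon radius $R_P = 2ct$ of (16.69) stretched by the factor $T(t)/T_0$, which is the combination that matches the quoted number (taking the diameter $2R_P$ instead would double it). The function names and the rounded value of $c$ are choices made here for illustration.

```python
C_CM_PER_S = 2.998e10  # speed of light in cm/s


def time_seconds(T_GeV: float, g: float) -> float:
    """Time since the big bang from relation (16.68)."""
    return 2.4e-6 * g ** -0.5 * T_GeV ** -2


def present_temperature_GeV(T0_kelvin: float = 2.7) -> float:
    """Present radiation temperature in GeV, relation (16.70)."""
    return 2.3e-13 * (T0_kelvin / 2.7)


def homogeneous_scale_cm(T_GeV: float, g: float, T0_kelvin: float = 2.7) -> float:
    """Present size of a region that was one horizon radius at temperature T_GeV."""
    horizon_radius = 2.0 * C_CM_PER_S * time_seconds(T_GeV, g)  # R_P = 2ct
    growth = T_GeV / present_temperature_GeV(T0_kelvin)         # S(t0)/S(t) = T/T0
    return horizon_radius * growth


print(f"GUT epoch:        {homogeneous_scale_cm(1e15, 100.0):.1f} cm")
print(f"ten times hotter: {homogeneous_scale_cm(1e16, 100.0):.1f} cm")
```

The second line illustrates the closing remark of the paragraph above: setting the initial conditions at a higher temperature only makes the predicted homogeneous region smaller.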


16.6.2 The flatness problem

When discussing the early and the very early Universe we ignored the
$kc^2/S^2$ term in the field equations. Thus (16.63) should actually have
been

$$\frac{\dot S^2}{S^2} + \frac{kc^2}{S^2} = \frac{8\pi G\rho}{3}. \tag{16.72}$$

Our justification in ignoring that term was that, as $S \to 0$, $\dot S^2 \to \infty$
and, thus, the first term far exceeds the second term on the left-hand
side of (16.72). This argument is, however, scale-dependent. Thus, if we
write $S = At^{1/2}$, then $\dot S^2 = A^2/(4t)$. Whether $\dot S^2$ exceeds $c^2$ for $k = \pm1$
would depend on $A$. A priori we do not know $A$, unless we link it with
the present size of the Universe. It is more convenient to look at the
density parameter $\Omega$ instead.

Fig. 16.5. A and B are two typical observers of the very early Universe
far enough apart that their particle horizons (shown by the past light
cones) do not overlap. If the homogenization process took place very early
in the Universe, then because of causal limitation it will be locally
limited to the respective light cones of A and B. How then could A and B
achieve the same physical conditions around them today? If these horizons
did limit the global homogenization process, then today A and B cannot be
further apart than $\sim$60 cm. Why then do we find the Universe homogeneous
on a scale $\sim10^{28}$ cm?

Writing $\rho = \Omega\rho_c$ as in (15.40), we have, at any general epoch when
$S \propto t^{1/2}$,

$$\frac{kc^2}{S^2} = (\Omega - 1)\frac{\dot S^2}{S^2} = \frac{\Omega - 1}{4t^2}. \tag{16.73}$$

For the present epoch, on the other hand,

$$\frac{kc^2}{S_0^2} = (\Omega_0 - 1)H_0^2. \tag{16.74}$$

On dividing (16.73) by (16.74) and using $S \propto T^{-1}$ we get, for $k = \pm1$,

$$\Omega - 1 = (\Omega_0 - 1)\cdot 4H_0^2 t^2\cdot\frac{T^2}{T_0^2}.$$

Except for $(\Omega_0 - 1)$ all quantities on the right-hand side are known.
Using (16.68) for $t$ and (16.70) for $T_0$, we get

$$(\Omega - 1) = 4.3\,h_0^2\,g^{-1}\times10^{-21}\,T_{\rm GeV}^{-2}\left(\frac{2.7}{T_0}\right)^2(\Omega_0 - 1). \tag{16.75}$$

For $T_{\rm GeV} = 10^{15}$ and $g = 100$, we get for $T_0 = 2.7$ K

$$\Omega - 1 = 4.3\,h_0^2\times10^{-53}\,(\Omega_0 - 1). \tag{16.76}$$

Fig. 16.6. The flatness problem described by the curves in the figure
shows that, unless the model was initiated at the GUT epoch within the
shaded region, today it would not exhibit a matter density comparable to
what is observed. In short, it had to be very finely tuned around
$\Omega = 1$, to be consistent with modern studies; tuned to the extent of
1 part in $10^{53}$.



This expression epitomizes what has come to be known as the flatness
problem. Suppose that the initial conditions including the density
parameter $\Omega$ were set at the GUT epoch when $T = 10^{15}$ GeV. Then the
present value of $(\Omega_0 - 1)$ is given by (16.76). Or, to invert the chain
of reasoning, suppose that the present observational uncertainty tells us
that $|\Omega_0 - 1| \sim O(1)$. Then, from (16.76), at the GUT epoch $\Omega$ differed
from unity by a fraction of the order of $10^{-53}$. In other words, the
departure from the flat value of $\Omega$ (= 1) at this stage had to be extremely
small. Any relaxation of this fine tuning would have led to a far wider
range of $\Omega_0$ at present than is permitted by observations.

So our neglect of the curvature term $kc^2/S^2$ is linked with an
extremely fine tuning of the Universe to the flat ($k = 0$) model. If this
tuning were not there, the Universe would either have gone into a
collapse ($k = 1$) or expanded to infinity ($k = -1$) on time scales of the
order of $10^{-35}$ s that were characteristic of the GUT era.

Figure 16.6 illustrates this conundrum. The shaded region denotes
the finely tuned set of Friedmann models that end up today within the
observed range $|\Omega_0 - 1| \sim O(1)$. The curves shown outside this region
are the characteristic models with time scales $\sim10^{-35}$ s which should
normally have operated at the GUT stage. What made the Universe get
into the shaded region instead?

This problem was first highlighted by R. H. Dicke and P. J. E. Peebles
in 1979, who discussed it not at the GUT epoch but at $t \sim 1$ s when the
neutrinos had decoupled and pair ($e^\pm$) annihilation was to begin. Thus
$T \sim 10^{-3}$ GeV, $g \sim 10$, and we get $\sim10^{-16}$ instead of $10^{-53}$ in (16.76).
It is clear that the further back in time and closer to $t = 0$ we go the finer
is the tuning required. For example, if we were to initialize the problem
at the Planck epoch, we would get $10^{-61}$ for the tuning range instead
of $10^{-53}$.
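The three tuning figures quoted above follow from the coefficient of $(\Omega_0 - 1)$ in (16.75), namely $4.3\,h_0^2\,g^{-1}\times10^{-21}\,T_{\rm GeV}^{-2}(2.7/T_0)^2$. The sketch below evaluates it; the function name is hypothetical, and the Planck-epoch inputs ($T \sim 10^{19}$ GeV, $g = 100$) are assumptions made here, since the text does not state them.

```python
def tuning_factor(T_GeV: float, g: float, h0: float = 1.0,
                  T0_kelvin: float = 2.7) -> float:
    """Coefficient multiplying (Omega_0 - 1) in relation (16.75)."""
    return 4.3e-21 * h0 ** 2 / g * T_GeV ** -2 * (2.7 / T0_kelvin) ** 2


# (label, temperature in GeV, effective g)
cases = [
    ("Dicke-Peebles epoch, t ~ 1 s", 1e-3, 10.0),
    ("GUT epoch", 1e15, 100.0),
    ("Planck epoch (assumed 1e19 GeV, g = 100)", 1e19, 100.0),
]

for label, T, g in cases:
    print(f"{label:42s} (Omega - 1)/(Omega_0 - 1) ~ {tuning_factor(T, g):.1e}")
```

With $h_0 = 1$ the output is of order $10^{-16}$, $10^{-53}$ and $10^{-61}$ respectively, showing how the required tuning sharpens as the initial epoch is pushed earlier.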


16.6.3 The entropy problem

This is a restatement of the flatness problem and the horizon problem
in a somewhat different form. The entropy in a given comoving volume
stays constant in an adiabatic expansion (see Section 16.2). The present
photonic entropy in the observable Universe of characteristic size
$R \approx h_0^{-1}\times10^{28}$ cm is given by

$$\Sigma = \frac{4}{3k}\,aT_0^3R^3 \approx h_0^{-3}\times4.4\times10^{87}\left(\frac{T_0}{2.7}\right)^3. \tag{16.77}$$

Why such a large value? If the entropy were conserved, we would have
$ST$ = constant. However, we found that in the flatness problem this
hypothesis led to fine tuning, whereas for the horizon problem it gave an
extremely small size of homogeneity. It therefore appears that the trouble
lies in $\Sigma$ = constant: it could be resolved if the adiabatic assumption
were violated at some stage and $\Sigma$ boosted to its present value by an
enormously large factor.


16.6.4 The monopole problem

In a grand unified theory, whenever there is a breakdown of symmetry
of a larger group like SU(5) to a subgroup like SU(3) × SU(2)$_L$ × U(1)
that contains the U(1) group, there inevitably arise particles that have the
characteristics of a magnetic monopole. This is a rigorous mathematical
conclusion in gauge field theories. Typically the mass of the monopole
(in energy units) is given by $\sim10^{16}$ GeV. Monopoles are highly stable
particles and once created they are not destructible, so they would survive
as relics to the present epoch.

At the GUT epoch $t$, the horizon size being $2ct$, we expect at least
one monopole per horizon-size sphere, i.e., a monopole mass density of

$$\frac{10^{16}\ {\rm GeV}/c^2}{(4\pi/3)(2ct)^3}.$$

At present this is diluted by the factor $(T_0/T)^3$. For $T_0$ in GeV units,
given by (16.70) and $T = 10^{15}$ GeV, we get the present monopole
density as

$$\rho_M = 3\times10^{-13}\left(\frac{T_0}{2.7}\right)^3\ {\rm g\,cm^{-3}}. \tag{16.78}$$

This is far in excess of the closure density $\sim10^{-29}$ g cm$^{-3}$, thus making
it a very awkward problem for the standard model to solve. Again, as in
the earlier cases, the discrepancy grows if, instead of the GUT epoch,
we use an even earlier epoch.


16.7 Inflation 

The extrapolation to such short time scales has thus brought its
problems. Four such problems have just been described. These problems are
addressed by the notion of ‘inflation’, a brainchild of three cosmologists,
D. Kazanas, A. Guth and K. Sato, who independently arrived at it while
working on astroparticle physics [69, 70, 71]. The scenario of inflation
has evolved a lot since its inception in 1980-81. Even today there is
no unique commonly accepted fully worked-out model of inflation. Yet
its consequences for the big-bang cosmology are attractive enough for
most workers to accept on trust a half-baked idea.

When the Universe cools down through the GUT epoch, a phase 
transition occurs when the single system described by grand unified 
theory splits into the electroweak and strong interaction. This phase 
transition can release a lot of energy, which is dumped into the dynamics 
of the Universe. 

An analogy will be in order to illustrate the scenario. Suppose steam 
is being cooled through the phase-transition temperature of 100 °C. 
Normally we expect the steam to condense to water at this temperature. 
However, it is possible to supercool the steam to temperatures below 
100 °C, although it is then in an unstable state. The instability sets in 
when certain parts of the steam condense to droplets of water, which 
then coalesce, and eventually the condensation goes to completion. In the 
supercooled state the steam still retains its latent heat, which is released 
as the droplets form. 

In the case of the Universe the ‘vacuum’ state is identified with
the state of lowest energy. However, the meaning of lowest energy
changes as the phase transition occurs. The water-steam analogy tells
us that the ‘true’ state of lower energy is that of water and the
supercooled steam identifies a ‘false’ state that has higher energy. When the
steam condenses, the false state changes to the true state. Likewise, in
the case of the Universe, the phase transition may be delayed, leading to
the existence of a false vacuum. When the transition is complete the true
vacuum state is attained. In the original Guth version (which had fatal
flaws) the switchover to true vacuum was through quantum tunnelling.

Whichever the mechanism of transition from false to true vacuum
(and that is specified by a potential function) the dumped energy leads to
a rapid expansion of the Universe. Denoting the extra energy density by
$\epsilon_0$, we find that it must have dynamical effects via the Einstein equation:

$$\frac{\dot S^2 + kc^2}{S^2} = \frac{8\pi G}{3c^2}\,(\epsilon_0 + \epsilon_r). \tag{16.79}$$


Here $\epsilon_r \propto 1/S^4$ is the energy density of radiation and relativistic
particles. Since $\epsilon_r$ falls as the Universe expands while $\epsilon_0$ stays constant,
the latter clearly dominates. Hence we ignore $\epsilon_r$ and solve (16.79). For
$k = +1$ we get, for example,

$$S = \left(\frac{3c^4}{8\pi G\epsilon_0}\right)^{1/2}\cosh\left[\left(\frac{8\pi G\epsilon_0}{3c^2}\right)^{1/2} t\right]. \tag{16.80}$$

For $k = -1$ we get a similar expression with ‘cosh’ replaced by ‘sinh’.
The main point to note is that for

$$t \gg \left(\frac{3c^2}{8\pi G\epsilon_0}\right)^{1/2} \tag{16.81}$$

either solution approaches closely the $k = 0$ (flat) solution

$$S \propto \exp(at), \qquad a = \left(\frac{8\pi G\epsilon_0}{3c^2}\right)^{1/2}. \tag{16.82}$$

This exponential expansion is reminiscent of the de Sitter model. Indeed,
the energy tensor of false vacuum simulates the $\lambda g_{ik}$ term of the Einstein
equations.
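As a check on the closed ($k = +1$) solution, it can be substituted back into the Einstein equation with the radiation term dropped. The short derivation below is a sketch that uses only the constant $a = (8\pi G\epsilon_0/3c^2)^{1/2}$ defined in (16.82).

```latex
\begin{align*}
\dot S^2 + c^2 &= a^2 S^2
  && \text{(16.79) with } k = +1,\ \epsilon_r \to 0, \\
S = \frac{c}{a}\cosh(at)
  &\;\Longrightarrow\;
  \dot S^2 + c^2 = c^2\sinh^2(at) + c^2 = c^2\cosh^2(at) = a^2 S^2, \\
\frac{c}{a} &= \left(\frac{3c^4}{8\pi G\epsilon_0}\right)^{1/2},
  \qquad
  \cosh(at) \to \tfrac{1}{2}\,e^{at} \quad \text{for } at \gg 1 .
\end{align*}
```

The last limit is why both the cosh and the sinh solutions become indistinguishable from the flat exponential solution once $at \gg 1$.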

This rapid expansion in an exponential fashion continues until (in
the original Guth version) the tunnelling takes place and $\phi$ attains its
true vacuum value. The average time $\tau$ for the tunnelling to occur can
be computed quantum mechanically. It tells us the factor $Z$ by which the
scale factor $S$ increased while inflation lasted. One finds that

$$a\tau \approx 67, \qquad Z = \exp(a\tau) \approx 10^{29}. \tag{16.83}$$

In other words, the exponential expansion or inflation lasts long enough
for the scale factor to blow up by a large multiple $\sim10^{29}$. Thus if we
had started with a curvature term ($kc^2/S^2$) comparable to the expansion
term ($\dot S^2/S^2$) prior to inflation we would end up by having the former
reduced by $Z^2 \sim 10^{58}$ while the latter stays constant. This large factor
$Z$ not only takes care of the fine tuning in the flatness problem but also
resolves the horizon problem (by blowing up the homogeneous region by
a factor $Z$ in linear dimensions) and the monopole problem (by reducing
the monopole density by the factor $Z^3$).
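The sizes of these factors are easy to check. The sketch below takes $a\tau = 67$ from (16.83) and the 62 cm homogeneity scale found in the horizon problem as inputs; the variable names are chosen here for illustration only.

```python
import math

A_TAU = 67.0                      # number of e-foldings, from (16.83)
Z = math.exp(A_TAU)               # growth of the scale factor during inflation

curvature_suppression = Z ** 2    # factor by which kc^2/S^2 is reduced
monopole_dilution = Z ** 3        # factor by which the monopole density drops

# Homogeneous scale without inflation (horizon problem) and with the extra
# stretching by Z, to be compared with the observed scale of about 1e28 cm.
scale_without_inflation_cm = 62.0
scale_with_inflation_cm = scale_without_inflation_cm * Z

print(f"Z   = {Z:.2e}")
print(f"Z^2 = {curvature_suppression:.2e}")
print(f"Z^3 = {monopole_dilution:.2e}")
print(f"homogeneous scale today = {scale_with_inflation_cm:.1e} cm")
```

On these numbers the stretched region comfortably exceeds $10^{28}$ cm, and the suppression $Z^{-2}$ of the curvature term is stronger than the $10^{-53}$ tuning demanded by (16.76).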
For these advantages and more so because inflation initiates the
structure formation in the Universe in a way that seems to lead to the 
right mass distributions at the present epoch, inflation has been accepted 
by cosmologists as a vital stage in the very early Universe. For the 
general-relativity purist the model leaves much to be desired. It is an 
approximate solution with no matching boundary conditions. That is, 
one does not know how exactly the conditions change across the bubble 
and its surroundings. Neither is the spatial extent of the initial bubble 
specified. The potential function that distinguishes the true vacuum from 
the false vacuum is also put in by hand rather than taken from some deep 
theory. 

We conclude our discussion of the early Universe here. At present
astroparticle physics is the most active area in theoretical cosmology.
Despite its speculative nature, its challenges invite theorists to try their
own prescription for how the Universe began. However, in the last
analysis, a scientific theory must pass the test of observations. We will
discuss a few important observations in the next chapter.


Exercises 

1. Substitute the values of c, G and a into (16.7) and verify the numerical 
coefficient in (16.8). 

2. Taking the present-day temperature of the radiation background as 2.73 K and
the present baryon density as $10^{-6}$ cm$^{-3}$, calculate the number ratio of photons
to baryons.

3. A primordial mixture of relativistic bosons and fermions in the early Universe
of temperature $T$ has the total energy density given by the formula

$$\epsilon = \frac{\pi^2}{30\hbar^3c^3}\,g_*(kT)^4.$$

Show that $g_* = g_b + (7/8)g_f$, where $g_b$ is the total spin degeneracy of all bosons
and $g_f$ is the total spin degeneracy of all fermions.

4. The binding energy of the $^4$He nucleus is $B = 4.3\times10^{-5}$ erg. Show that
for the nucleus $B/[k(A - 1)] = 10^{11}$ K. Next assume that the present value of
the radiation temperature is 3 K and that of the nucleon density is $10^{-6}$ cm$^{-3}$.
Using the result that $N_N T^{-3}$ = constant, show that (16.47) gives $T_Q$ for $^4$He
as $\sim3.2\times10^9$ K.

5. If $m$ is the mass of a nucleon and if $\Omega_0$ is the density parameter, show that the
present number density of baryons is $3H_0^2\Omega_0/(8\pi Gm)$. Use this formula and
the present microwave background temperature $T_0 = 2.7$ K to estimate in
(16.58). Solve the Saha equation for $\Omega_0 = 0.1$, $h_0 = 1$ to show that, at 3000 K,
$x = 0.003$.

6. Using the Thomson-scattering cross section for the electrons, show that the
optical depth of the Universe at the present epoch would be given by $0.08\,\Omega_0h_0$
if all electrons in the Universe were free and equal in number to the baryons and
there were no non-baryonic matter.

7. Assuming that in the past the electron number density increased as $(1 + z)^3$,
use the analysis of Exercise 6 to estimate the smallest redshift at which the
Einstein-de Sitter universe was opaque to radiation. (Take $h_0 = 1$.) Comment
on the fact that your answer comes out very much lower than $z \sim 1000$.

8. Give arguments to show that the neutrino temperature drops as $S^{-1}$ after
neutrinos decouple from the rest of the matter.

9. Why is the present neutrino temperature expected to be lower than the photon
temperature? Derive the ratio of the two temperatures from considerations of
the early Universe.

10. Suppose we wish to apply flat-space statistical mechanics to the very early
Universe at epoch $t$. The locally flat region may be characterized by a linear
size $L < \alpha ct$, where $\alpha \ll 1$. Estimate the number of relativistic species in this
region using a time-temperature relationship. Show that, while this number is
$\gg 1$ for the primordial nucleosynthesis era ($t \sim 1$-200 s), it is $< 1$ for the GUT
era. Can flat-space statistical physics be applied at the GUT era?



Chapter 17 

Observational cosmology 


We will now take a look at some of the tests of the relativistic
cosmological models discussed so far. We will confine ourselves to tests
that bring out the general-relativity part of the model, rather than other
aspects like astrophysics, particle physics, etc. For a more comprehensive
discussion see [53].

17.1 The redshift-magnitude relation 

We saw in Chapter 14 that for small redshifts Hubble’s law holds. What
is the form of this relation when redshifts are not small compared with
unity? Formula (15.67) tells us the relationship between luminosity
distance $D$ and redshift $z$ for Friedmann models without the cosmological
constant.

In the practical form in which this test is often presented, astronomers
use apparent magnitudes in place of distances. Thus the $D$-$z$ relation
becomes

$$m - M = 5\log D - 5 = 5\log\left(\frac{c}{H_0q_0^2}\right) - 5 + 5\log\left[q_0z + (q_0 - 1)\left(\sqrt{1 + 2q_0z} - 1\right)\right]. \tag{17.1}$$

A. R. Sandage and his colleagues spent a number of years on this
cosmological test with the hope that the correct geometry of the Universe
would be revealed. Although in the 1960s Sandage often quoted a value
of $q_0 \approx 1$, it gradually became clear that a number of uncertainties
combine to make this test rather inconclusive. A typical $z$-$m$ curve
obtained by Sandage is shown in Figure 17.1. The various errors and
uncertainties that arise in practical applications of this test are many.
Some of the issues have been understood and partially resolved; others
continue to be difficult to settle. For these reasons this test of spacetime
geometry fell into disfavour during the 1980s and early 1990s.

Fig. 17.1. The redshift-magnitude plot for the galaxies which are the
brightest in their clusters. The plot is based on the work of Allan Sandage
and his colleagues (A. Sandage et al. 1978, Ap. J., 221, 383) who showed
that there is very little variation in the luminosity of brightest cluster
members. Theoretical curves for various values of $q_0$ are superposed on
the data points.

However, in the late 1990s a fresh attempt was made to revive this
test when it was realized that Type Ia supernovae can be used to estimate
$m$ relatively unambiguously at redshifts as high as unity.


17.1.1 The Hubble diagram using Type Ia supernovae

During the 1980s, it was realized that Type Ia supernovae can serve as
standard candles in the following way. The light curve of such a
supernova (cf. Figure 17.2) shows an approximately symmetric characteristic
rise and fall over $\sim$30 days, followed by a much slower decline. The
maximum luminosity of a Type Ia supernova shows an almost uniform
value for this population, the dispersion being no more than 0.3
magnitude. Going beyond that, however, we now see that, because of their
high peak luminosity, they can be spotted in distant galaxies. Thus they
are suitable for determining the $z$-$m$ relation, out to redshifts of $\sim$1 or
even more.

Fig. 17.2. The rise and fall of light intensity during the explosion of a
supernova of Type Ia. The peak intensity is expected to vary from one
supernova to another, but within a rather narrow range.


In 1988 the Supernova Cosmology Project (SCP) was launched and a
systematic search for and observations of such supernovae were carried
out by several observatories round the world, using telescopes in the
4-m class. The Keck I and II telescopes were used for measurements of
redshifts and spectral identifications, as was the ESO 3.6-m telescope.
The database continues to grow.

In 1999 Perlmutter et al. [73] used 60 supernovae to draw up the
Hubble plot. Of these, 18 came from the work on nearby supernovae by
Riess et al. [72]. These were used essentially to set the zero point of the
plot, with the remaining 42 coming from the SCP with redshifts starting
from 0.18 and going as far as 0.83.

Theoretical Friedmann models with $\lambda = 0$ can be applied to such
data using the formulae for $D(z, q_0)$ from Chapter 15. However, their
observational fits were not very satisfactory and the parameter space had
to be expanded to include the cosmological constant. We briefly discuss
the theoretical aspect of these models showing how the $m$-$z$ relation can
be derived numerically. The dimensionless parameters in question are

$$\Omega_0 = \frac{8\pi G\rho_0}{3H_0^2}, \qquad \Omega_\Lambda = \frac{\lambda c^2}{3H_0^2}, \tag{17.2}$$

these being respectively the density parameter and the
cosmological-constant parameter.

Using the formulae of Chapter 15, we can write the following relation
between the radial Robertson-Walker coordinate $r$ and redshift $z$:

$$r(z) = \int_{S(t_0)/(1+z)}^{S(t_0)} \frac{c\,dS}{S\dot S}. \tag{17.3}$$

It is not difficult to see that using Equation (15.92) we can write the
above in the flat case ($k = 0$) as

$$r(z) = \frac{c}{S_0H_0}\int_1^{1+z}\frac{dx}{\sqrt{\Omega_\Lambda + \Omega_0x^3}}. \tag{17.4}$$


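Example 17.1.1 Let us reduce (17.3) to (17.4) for the case $k = 0$. From
Equation (15.92) we get for $k = 0$

$$\dot S^2 = \frac{1}{3}\lambda c^2S^2 + \frac{8\pi G\rho_0S_0^3}{3S}.$$

Therefore

$$\dot S^2S^2 = \frac{1}{3}\lambda c^2S^4 + \frac{8\pi G}{3}\rho_0S_0^3S.$$

Using the definitions (15.40) and (15.96), we get

$$\rho_0 = \frac{3}{8\pi G}\left(\frac{\dot S}{S}\right)_0^2\Omega_0, \qquad \lambda c^2 = 3\left(\frac{\dot S}{S}\right)_0^2\Omega_\Lambda.$$

Writing $H_0 = [\dot S/S]_0$, we replace $\lambda$ and $\rho_0$ in our equation by $\Omega_\Lambda$ and $\Omega_0$,
to get

$$\dot S^2S^2 = \Omega_\Lambda H_0^2S^4 + \Omega_0H_0^2S_0^3S.$$

Using the relation $S_0/S = x$ we get

$$\dot S^2S^2 = \Omega_\Lambda H_0^2S_0^4x^{-4} + \Omega_0H_0^2S_0^4x^{-1}.$$

Hence, since $dS = -S_0\,dx/x^2$,

$$\frac{c\,dS}{S\dot S} = -\frac{cS_0\,dx}{x^2}\,\frac{1}{H_0S_0^2}\left[\Omega_\Lambda x^{-4} + \Omega_0x^{-1}\right]^{-1/2} = -\frac{c\,dx}{H_0S_0}\left(\Omega_\Lambda + \Omega_0x^3\right)^{-1/2}.$$

The limits $S = S_0/(1+z)$ and $S = S_0$ correspond to $x = 1 + z$ and $x = 1$,
so the minus sign is absorbed on reversing them. The $r$-integral therefore
becomes

$$r(z) = \int_1^{1+z}\frac{c\,dx}{H_0S_0\sqrt{\Omega_\Lambda + \Omega_0x^3}},$$

with the luminosity distance as

$$D(z) = r(z)S_0(1+z) = \frac{(1+z)c}{H_0}\int_1^{1+z}\frac{dx}{\sqrt{\Omega_\Lambda + \Omega_0x^3}}.$$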


Now recall that the luminosity distance $D = rS(t_0)(1 + z)$ and we
can write down, in the first approximation, the following $z$-$m$ relation:

$$m(z) = -2.5\log L + 5\log D + {\rm constant}. \tag{17.5}$$
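The numerical derivation of the $m$-$z$ relation mentioned above can be sketched as follows for the flat case. The code integrates $D(z) = (1+z)(c/H_0)\int_1^{1+z}dx/\sqrt{\Omega_\Lambda + \Omega_0x^3}$ with Simpson's rule and converts it to a distance modulus. The function names and the value $H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$ are illustrative assumptions, and no K correction or dispersion in $L$ is included.

```python
import math

C_KM_PER_S = 299792.458  # speed of light in km/s


def simpson(f, a, b, n=200):
    """Composite Simpson's rule for the integral of f over [a, b]."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def luminosity_distance_mpc(z, omega_0, omega_lambda, H0=70.0):
    """Luminosity distance in Mpc for a flat (k = 0) model, from (17.4)."""
    integrand = lambda x: 1.0 / math.sqrt(omega_lambda + omega_0 * x ** 3)
    return (1.0 + z) * (C_KM_PER_S / H0) * simpson(integrand, 1.0, 1.0 + z)


def distance_modulus(z, omega_0, omega_lambda, H0=70.0):
    """m - M = 5 log10(D / 10 pc), with D expressed in Mpc."""
    return 5.0 * math.log10(luminosity_distance_mpc(z, omega_0, omega_lambda, H0)) + 25.0


for z in (0.18, 0.5, 0.83):
    eds = distance_modulus(z, 1.0, 0.0)      # Einstein-de Sitter
    lam = distance_modulus(z, 0.28, 0.72)    # flat model with cosmological constant
    print(f"z = {z:4.2f}  EdS: {eds:6.2f}  flat-lambda: {lam:6.2f}  difference: {lam - eds:5.2f}")
```

At $z = 0.5$ the flat model with $\Omega_0 = 0.28$ puts a standard candle roughly 0.4 magnitude fainter than the Einstein-de Sitter model does, which is the size of effect the supernova test has to detect. The redshifts 0.18 and 0.83 are the ends of the SCP range quoted earlier.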
Fig. 17.3. The contours on the $\Omega_m$-$\Omega_\Lambda$ plot show the probability of
validity of a model based on the supernova data. It is argued that the
standard model is consistent with the available data, if not its best fit.
(From Perlmutter et al. 1999, Ap. J., 517, 565.)

Of course, one has to correct this relation for the K correction and
for other possible effects mentioned before. $L$, as already pointed out,
contains a dispersion around the average standard-candle luminosity.
In fitting a best-fit curve through the data the dispersions in apparent
magnitudes have to be taken into consideration. Figure 17.3 shows how
well the various theoretical models match the observations. (Here $\Omega_m$
is the same as $\Omega_0$.)

Perlmutter et al. found that the simplest Friedmann model, namely
the flat Einstein-de Sitter model, does not give a statistically satisfactory
fit. The flat model, however, does fit well if a non-zero cosmological
constant is allowed. That is, consistently with (15.97), i.e.,

$$\Omega_\Lambda + \Omega_0 = 1,$$

the model with $\Omega_0 = 0.28$ gives the best fit. This implies that a non-zero
cosmological-constant parameter $\Omega_\Lambda$ as high as 0.7 is needed. In other
words, the Universe has a negative $q_0 \approx -0.6$, if we use the relation
(15.95). Thus the Universe is accelerating.
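The arithmetic behind the quoted deceleration parameter is short. The sketch below assumes that (15.95) is the usual relation $q_0 = \tfrac{1}{2}\Omega_0 - \Omega_\Lambda$ for a Friedmann model containing pressure-free matter and a cosmological constant; that equation is not reproduced in this section, so this form is an assumption.

```latex
\begin{align*}
\Omega_\Lambda &= 1 - \Omega_0 = 1 - 0.28 = 0.72, \\
q_0 &= \tfrac{1}{2}\,\Omega_0 - \Omega_\Lambda = 0.14 - 0.72 = -0.58 \approx -0.6 .
\end{align*}
```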

Clearly, in view of the profound significance of such a finding, 
careful follow-up is regularly being done. Several questions arise. How 
sure are we that there is no evolution in supernovae that would spoil 
their standard-candle interpretation? Some four or five supernovae in
the data have to be left out of the curve-fitting exercise because they lie 
far off from the best-fit curve: why? Could some other explanation rather 
than the cosmological constant account for the extra dimming found in 
the supernovae? The possible effect of dust in standard cosmology has 
also been invoked by some. They argue that the dust causes the extra 
dimming that is observed. We will discuss the role of intergalactic dust 
in the context of the quasi-steady-state cosmology when we return to this 
test in the final chapter. There we will also discuss how this test plays 
a complementary role in the parameter space in relation to the cosmic 
microwave background. 


17.2 Number counts of extragalactic objects 

The basic idea behind these tests is to find out whether the number 
counts reveal the non-Euclidean nature of the spacetime geometry of the 
Universe assumed by most models. We illustrate with a simple example 
from radio astronomy. Suppose we have a class of radio sources that are 
(1) uniformly distributed in space and (2) have the same luminosity L. 
If we further assume that (3) the Universe is of Minkowski type, that is,
with Euclidean spatial geometry, the number of sources up to a given
distance $R$ will go as

$$N \propto R^3, \tag{17.6}$$

while the flux density from the faintest of the sources up to distance $R$
goes as

$$S \propto R^{-2}. \tag{17.7}$$

By eliminating $R$ between these relations, we get

$$NS^{3/2} = {\rm constant}, \quad\text{that is,}\quad \frac{d\log N}{d\log S} = -1.5. \tag{17.8}$$

Thus (17.8) tells us how $N$ and $S$ are related under our three assumptions
(1), (2) and (3). Under these assumptions $N$ measures the volume and
$S^{-1/2}$ the radius of a spherical region centred on the observer, and (17.8)
is simply the volume-radius relation in Euclidean geometry.
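The Euclidean slope can be confirmed numerically. The sketch below builds $N(R)$ and $S(R)$ for a uniform population of identical sources, as in assumptions (1) to (3), and fits the slope of $\log N$ against $\log S$; the unit number density, unit luminosity and function names are arbitrary illustrative choices.

```python
import math


def source_count(R, number_density=1.0):
    """Number of sources inside a Euclidean sphere of radius R."""
    return (4.0 / 3.0) * math.pi * number_density * R ** 3


def faintest_flux(R, luminosity=1.0):
    """Flux density of a source of the given luminosity at distance R."""
    return luminosity / (4.0 * math.pi * R ** 2)


def log_log_slope(xs, ys):
    """Least-squares slope of log10(y) against log10(x)."""
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den


radii = [10.0 ** (0.1 * i) for i in range(1, 31)]
fluxes = [faintest_flux(R) for R in radii]
counts = [source_count(R) for R in radii]

slope = log_log_slope(fluxes, counts)
print(f"d log N / d log S = {slope:.3f}")
```

Any departure of an observed slope from $-1.5$ therefore signals that at least one of the three assumptions fails, which need not be the geometrical one.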

Given the Robertson-Walker models, we can work out the
corresponding relations in non-Euclidean geometries. It is therefore possible,
in principle, to test whether the observed relation agrees with one of
the various cosmological models. Unfortunately, as with the $z$-$m$ test,
various uncertainties prevent us from drawing a clear-cut conclusion, as
we shall see with the counts of galaxies and radio sources below.

17.2.1 Counts of galaxies

In 1936 Hubble attempted number counts of galaxies in order to
distinguish between model universes. However, he had to abandon the test
because the number of galaxies to be counted is very large, and unless
one goes fairly deep in space one cannot detect any significant
departures from Euclidean geometry. However, there is one difference from
the formula (17.8). Since the optical astronomer measures fluxes in
magnitudes, the corresponding relation describing the number $N$ of galaxies
brighter than apparent magnitude $m$ becomes

$$\frac{d\log N}{dm} = 0.6. \tag{17.9}$$
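The passage from (17.8) to (17.9) uses only the definition of the magnitude scale. The lines below are a sketch assuming the standard relation $m = -2.5\log S + {\rm constant}$, which is not written out in this section.

```latex
\begin{align*}
m = -2.5\log S + \text{constant}
  &\;\Longrightarrow\; \frac{d\log S}{dm} = -0.4, \\
\frac{d\log N}{dm}
  &= \frac{d\log N}{d\log S}\,\frac{d\log S}{dm}
   = (-1.5)\times(-0.4) = 0.6 .
\end{align*}
```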


Any effect of non-Euclidean geometry will show up as a deviation
from this straight-line relation, but these differences become noticeable
only at redshifts as high as 0.5, say. By then one needs to identify and
count millions of galaxies, even in a small sector of the sky. Although
Hubble did not succeed, his programme was revived in recent years by
a number of workers, who now have at their disposal many electronic
and solid-state devices to facilitate galaxy counts to very faint
magnitudes ($m \sim 24$). For example, in 1979 J. A. Tyson and J. F. Jarvis first
used techniques of automated detection and classification of galaxies
on plates. Their main problem at faint magnitudes was to be able to
distinguish stars from galaxies.

Even though one may find ways round these practical problems, the 
outcomes of such counts are hard to interpret as deviations caused by 
geometry. Rather, evolutionary effects and inhomogeneities of sample 
galaxy populations chosen for counting dominate the observations. 


17.2.2 Counts of radio sources 

In comparison with galaxy counts, counts of radio sources have the 
advantage that the latter are not as numerous as galaxies. For this reason, 
after Hubble’s galaxy-count programme had come to nothing and as radio 
astronomy became established during the 1950s, it was felt that the time 
was ripe to have a go at the radio-source-count test. Radio astronomers 
also felt that strong radio sources could be seen at much further distances 
than galaxies, and hence they would provide more stringent tests on the 
large-scale geometry of the Universe. Also they are much rarer than 
galaxies, so there are not many of them to count. 

M. Ryle at Cambridge, B. Mills at Sydney and J. Bolton at Caltech
did pioneering work on the source-count programme. Since the radio
astronomer measures flux over a specified bandwidth, he tends to plot
$\log N$ against $\log S$, where $S$ is the flux density, that is, the flux received
over a frequency band divided by the bandwidth. The usual unit for
$S$ is the jansky (Jy) (named after Karl G. Jansky, who did pioneering
work in radio astronomy in the 1930s), which equals $10^{-26}$ W m$^{-2}$ Hz$^{-1}$.
Similarly, the power of the radio source is defined as luminosity over
a unit frequency band per unit solid angle and is expressed in units of
watts per hertz per steradian (W Hz$^{-1}$ sr$^{-1}$).

The early source counts led to considerable controversy and a lot 
of discussions, largely because the problem of interpreting the observed 
source counts was oversimplified. Several factors intervene to make a 
simple conclusion elusive. Some of them are as follows. 

1. Counts are affected by local large-scale inhomogeneities of the matter
distribution. For example, a local void near us would make the $\log N$-$\log S$ curve
steeper than Euclidean.

2. There could be different types of sources, all mixed up in the survey. For 
example, quasars and radio galaxies mixed together would give misleading 
answers. These populations need to be counted separately. 

3. Evolution in number density or luminosity of the source population will 
easily mask any geometrical effect that is being looked for. 

4. The luminosity function needs to be known before the source count can 
be reliably made. This last point is important since, unlike the optical 
astronomer, the radio astronomer cannot measure the redshift of the radio 
source and so relies on the flux density to estimate its distance. 

The surveys today are much more sophisticated and accurate than 
the early pioneering radio surveys of 1955-65. (Figure 17.4 shows an 
example.) But they have also brought the realization that distinguishing 
between different geometries in this way is not possible. 

17.3 The variation of angular size with distance 

This test was briefly discussed in Chapter 15, where we saw that the 
angular size of an object of fixed projected linear size does not steadily 
decrease with its spatial distance from us. Figure 15.7 showed how the 
angular size changes with the redshift of the object in various Friedmann 
models. In 1958 F. Hoyle [58] first suggested that this property of non- 
Euclidean geometries could in principle be tested by radio-astronomical 
observations. 
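
The non-monotonic behaviour of the angular size is easy to see in the Einstein-de Sitter case. The sketch below assumes the standard Einstein-de Sitter angular-diameter distance; the value of the Hubble constant (70 km/s/Mpc) and the rod length (0.1 Mpc) are illustrative assumptions, and the function name is hypothetical.

```python
import math

def eds_angular_size(z, linear_size_mpc, hubble_kms_mpc=70.0):
    """Angle (radians) subtended by a rod of fixed proper length at redshift z
    in the Einstein-de Sitter model."""
    c_kms = 299792.458
    hubble_distance = c_kms / hubble_kms_mpc          # c / H0 in Mpc
    # Angular-diameter distance: D_A = (2c/H0) [1 - (1+z)^(-1/2)] / (1+z)
    d_a = 2.0 * hubble_distance * (1.0 - (1.0 + z) ** -0.5) / (1.0 + z)
    return linear_size_mpc / d_a

redshifts = [0.01 * i for i in range(1, 501)]
sizes = [eds_angular_size(z, 0.1) for z in redshifts]
z_min = redshifts[sizes.index(min(sizes))]
print(f"Angular size is smallest near z = {z_min:.2f}")
```

In this model the minimum falls at z = 1.25 independently of the Hubble constant; beyond that redshift the same rod subtends a larger angle.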

Early attempts to look for this effect in galaxies at different redshifts 
failed since it was not possible in the 1960s to carry out measurements 
of angular sizes of galaxies so far away. After Hoyle’s proposal, several 
radio astronomers took up the challenge. The typical radio source is a 
linear structure and it is not so difficult to measure the angle subtended by 
it. In the radio case, however, obtaining redshifts directly is not possible:
one must optically identify the source and then measure the redshift of
the optical counterpart. After several studies in the radio no clear signal
emerged since there were other, more dominant, effects whose presence
would be significant. These effects included (1) a projection effect while
measuring the angle subtended by a linear structure at the observer;
(2) dispersion in linear size, i.e., there is no standard yardstick to refer
to; and (3) the ubiquitous possibility of evolution that would introduce
z-dependent factors into the answer.

Fig. 17.4. Two curves of source counts plotted against flux density, one at
0.4 GHz and the other at 1.4 GHz. $\Delta N$ are the numbers in a flux-density
range (S, S + $\Delta S$) while $\Delta N_0$ are the numbers in the same flux-density
range in a static Euclidean spacetime. Notice that the counts go down to
millijansky (top curve) and microjansky (bottom curve), respectively,
compared with the early surveys which reached down to a few jansky. It is,
however, not possible to relate these curves to a specific spacetime
geometry.

Fig. 17.5. The data points of angular size and redshift for radio sources
with three model curves fitted to them.

To eliminate or at least to minimize the evolutionary effect of the 
intergalactic medium, Kellermann in 1993 suggested that the test be 
applied to the very tiny inner components of quasars which are seen 
through very-long-baseline interferometry. His preliminary studies gave 
a result in broad agreement with the Einstein-de Sitter model. However, 
a more thorough analysis of a sample of 256 ultracompact sources with 
redshifts in the range 0.5 to 3.8 by J. C. Jackson and Marina Dodgson 
showed that this model is in fact ruled out and that, for better fits, one 
needs to invoke the cosmological constant. Figure 17.5 shows the $\theta$-z
curves for three types of models fitted by them to the data. The points
shown represent median values of data divided into 16 bins.

To sum up, the $\theta$-z test, to begin with, looked a simple and elegant
way of checking on spacetime geometry, but observational realities
turned out, once again, to be frustrating to the theorist!

17.4 The age of the Universe 

The formulae in Chapter 15 give $t_0$ as the age of the Universe according
to the various Friedmann models. These formulae depend on two
parameters, $H_0$ and $q_0$ (or $\Omega_0$), both of which have been discussed before.
Additionally we have the choice of including the cosmological constant.
We are now in a position to take a look at the question of whether the
Friedmann age estimates are consistent with the various astrophysical
estimates of the age of the Universe. Figure 15.12, reproduced here as
Figure 17.6, gives the range of values of the ages of the Friedmann models
for various values of these parameters, for purposes of comparison.

Fig. 17.6. The age of the Universe plotted as a function of the density
parameter $\Omega_0$ for various curves of given $\Omega_\Lambda$. In general the age
increases with $\Omega_\Lambda$.

At present there are two ways of estimating the ages of galaxies, 
both of which have been applied to our Galaxy. A primary requirement 
of consistency is, of course, that the age of the Universe in a Friedmann 
model must exceed the age of any object in it. 


17.4.1 Stellar evolution 

This method, applied to globular clusters in our Galaxy, is based on the 
principle that stars become redder and brighter when they leave the main 
sequence to become red giants. Since the red giant phase in the star’s life 
lasts a comparatively short time, say up to about 10% of the time the star 
spends on the main sequence, the turning point from the main sequence 
to the giant branch provides the cluster age to within 10% uncertainty. 

Let the cluster age, the time when the stars turn off from the main
sequence, be denoted by $t_c \times 10^9$ years, and let Y and Z be the helium
and metal abundances in the star at this stage. The calculations of stellar
evolution then show that

$$\log t_c = 1.035 + 2.085\,(0.3 - Y) - 0.03\,(\log Z + 3). \tag{17.10}$$

Thus the age depends critically on the helium abundance Y. Y can be
estimated from a comparison of the time a star spends on the horizontal
branch with the time it spends on the red giant branch. If this ratio is R,
then calculations show that

$$Y = 0.3 - 0.39 \log(\,\ldots\,), \tag{17.11}$$

where f = 2 if the stellar model takes account of semiconvection and
certain other effects, whereas f = 1 if these effects are not taken into
account. R can be estimated from the observed ratio of horizontal-branch
stars and red giant stars in the cluster.

Cluster ages deduced by this method fall in the range from ~$13 \times 10^9$
to ~$18 \times 10^9$ years.
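
To see how sharply the age in equation (17.10) responds to the helium abundance, the relation can be evaluated directly. This is a minimal sketch, assuming the logarithms are to base 10; the function name and the sample values of Y and Z are illustrative choices, not values taken from the text.

```python
import math

def cluster_age_gyr(helium_y, metal_z):
    """Turn-off age t_c (in units of 10^9 years) from equation (17.10)."""
    log_tc = 1.035 + 2.085 * (0.3 - helium_y) - 0.03 * (math.log10(metal_z) + 3.0)
    return 10.0 ** log_tc

for y in (0.20, 0.23, 0.30):
    print(f"Y = {y:.2f}: t_c = {cluster_age_gyr(y, 1e-3):.1f} x 10^9 yr")
```

With Z = 0.001 the ages run from about 17.5 (Y = 0.20) through 15.2 (Y = 0.23) down to 10.8 (Y = 0.30), in units of $10^9$ years, so a change of 0.1 in Y shifts the age by well over a third.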

17.4.2 Nuclear cosmochronology

In 1960 F. Hoyle and W. A. Fowler first demonstrated how the relative 
abundances of radioactive nuclei of long lifetimes can lead to estimates 
of the age of our Galaxy. The method had already been used for estimating
the age of the Solar System. For example, current observations
of the abundance ratio $^{87}$Sr/$^{86}$Sr plotted against $^{87}$Rb/$^{86}$Sr in various
Solar-System materials (such as meteorites) give its age accurately as
$t_S = 4.54 \times 10^9$ years.
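
The way such a plot yields an age can be illustrated with an isochron fit. The sketch below is only an illustration: the decay constant of $^{87}$Rb, the initial ratio and the sample points are assumptions made for the example (the data are synthetic, not measured), and the function names are hypothetical.

```python
import math

# Assumed decay constant of 87Rb (per year); the text does not quote a value.
LAMBDA_RB87 = 1.42e-11

def isochron_slope(xs, ys):
    """Least-squares slope of y against x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

def isochron_age(slope, decay_constant=LAMBDA_RB87):
    """Age from the isochron slope, slope = exp(lambda * t) - 1."""
    return math.log(1.0 + slope) / decay_constant

# Synthetic (not measured) samples sharing one initial 87Sr/86Sr ratio.
true_age = 4.54e9
initial_ratio = 0.699
rb_sr = [0.1, 0.5, 1.0, 1.5]
sr_sr = [initial_ratio + x * (math.exp(LAMBDA_RB87 * true_age) - 1.0) for x in rb_sr]

age = isochron_age(isochron_slope(rb_sr, sr_sr))
print(f"Recovered age: {age:.3e} years")
```

Samples that formed together with the same initial strontium ratio lie on a straight line whose slope depends only on the elapsed time, which is why the age can be read off without knowing the initial ratio.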

As illustrated in Figure 17.7, the method of nuclear cosmochronology
attempts to estimate the time elapsed before the Solar System was
formed. According to this method, we start our nuclear clock at t = 0
with the birth of the Galaxy. The stars evolve and the more massive ones
become supernovae, which manufacture long-lived radioactive nuclei
in the so-called r-process (the rapid absorption of neutrons by heavy
nuclei). The rate at which this process goes on is denoted by a function
p(t), which declines to negligible value at t = T. Between this epoch
and the formation of the Solar System there occurs a short time gap $\Delta$,
known as the isolation time, during which we may ignore nucleosynthesis,
in particular the r-process. Thus the total nuclear age of the Galaxy is

$$t_G \approx T + \Delta + t_S. \tag{17.12}$$

Fig. 17.7. Three time spans need to be estimated in order to estimate the
age of the Galaxy. First we need the time T spent in the r-process to
manufacture long-lived radioactive nuclei. Then one needs the isolation
time $\Delta$, which can be estimated (together with T) from data on abundances
of radioactive isotopes. Finally $t_S$ is estimated from radioactivity data of
Solar-System material. (The time axis is marked at t = 0, T and T + $\Delta$.)

By techniques using data on radioactive isotopes and the observed 
abundances of certain long-lived nuclei, one can estimate T and $\Delta$.

The nuclear age so estimated lies in the range between 6 and 
20 billion years, the width of this range indicating the span of 
uncertainties in the various quantities used for determining the various 
time intervals. 

It is clear nevertheless, when these age estimates and the estimates
from globular clusters are compared with those of Figure 17.6, that models
with $h_0 = 1$ and $\Omega_0 \geq 1$ will find it very difficult to accommodate
the above astrophysical estimates of the age of our Galaxy. In particular,
the original inflationary model without $\lambda$ is ruled out because it predicts
$\Omega_0 = 1$ unequivocally. One needs the cosmological constant.

To make the problem easier for the conventional point of view, 
attempts are being made to see whether the stellar and radioactive ages 
can be brought down significantly. For example, if significant mass 
loss occurs during the main-sequence stage of stellar evolution then 
the time spent by the star on the main sequence is reduced. (For it 
started with higher mass and evolved faster.) By arguing in this way 
it may be possible to reduce the ages of globular clusters to values as
low as (7-10) $\times 10^9$ years. Likewise W. A. Fowler and C. C. Meisl have
recalculated the nuclear age of the Galaxy using a time-dependent model
for nucleosynthesis in which an early ‘spike’ is followed by a uniform
rate of synthesis. They claim that the age then comes down to 11 ±
1.6 (1$\sigma$) billion years. However, these efforts seem contrived at best.

Observationally also, M. Feast et al., working with the Hipparcos 
data on stellar parallaxes, came up with a likely way of reducing stellar 
ages. They argued that because of these revised measurements there 
have been systematic increases in the revised stellar distances, so that 
the stellar luminosities are increased and the evolutionary time scales 
reduced. This could certainly help in reducing the gap between stellar 
and cosmological ages, but it is doubtful that the discrepancy can be 
completely eliminated in this way. 

For these reasons, the resurrection of the cosmological constant has
helped the big-bang cosmology. For, as we saw in Chapter 15, the $\lambda$-term
can be suitably chosen to make the age of the Universe as long as we
please. Figure 17.6, for example, shows how the age of the flat Friedmann
model increases as the magnitude of the cosmological constant ($\Omega_\Lambda$ as
defined in Chapter 15) is increased. However, the introduction of this
constant increases the cosmological distances and thereby increases the
probability of a distant light source being gravitationally lensed. On the
basis of the frequency of lensed objects, upper limits have been placed on
the dimensionless parameter $\Omega_\Lambda$: it is generally agreed that $\Omega_\Lambda$ cannot
much exceed 0.75.
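
The lengthening of the age can be checked with the closed-form age of a spatially flat model. This is a sketch under stated assumptions: the model is taken to be flat with $\Omega_0 + \Omega_\Lambda = 1$ and negligible radiation, the Hubble time for $h_0 = 1$ is rounded to $9.78 \times 10^9$ years, and the function name is hypothetical.

```python
import math

HUBBLE_TIME_GYR = 9.78  # 1/H0 in 10^9 years for h0 = 1 (rounded)

def flat_age_gyr(omega_lambda, h0=1.0):
    """Age of a spatially flat Friedmann model with Omega_0 + Omega_Lambda = 1."""
    hubble_time = HUBBLE_TIME_GYR / h0
    if omega_lambda == 0.0:
        return (2.0 / 3.0) * hubble_time      # Einstein-de Sitter limit
    root = math.sqrt(omega_lambda)
    factor = (2.0 / (3.0 * root)) * math.log((1.0 + root) / math.sqrt(1.0 - omega_lambda))
    return factor * hubble_time

for omega_lambda in (0.0, 0.5, 0.7, 0.75):
    print(f"Omega_Lambda = {omega_lambda:.2f}: age = {flat_age_gyr(omega_lambda):.2f} Gyr (h0 = 1)")
```

For $h_0 = 1$ the Einstein-de Sitter age is about $6.5 \times 10^9$ years, whereas $\Omega_\Lambda = 0.7$ gives roughly $9.4 \times 10^9$ years; a smaller $h_0$ scales every entry up by $1/h_0$.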

17.5 Abundances of light nuclei 

It is generally recognized that nuclei with relative atomic masses A $\geq$ 12
are synthesized in stars through various processes discussed in theories
of stellar evolution. The nuclei $^6$Li, $^9$Be, $^{10}$B and possibly $^{11}$B could be
produced in galactic cosmic rays by the break-up of heavy nuclei as they
travel through the interstellar gas. It is the lighter nuclei, in particular
$^2$H, $^3$He, $^4$He and $^7$Li, that appear to pose difficulties of production in
stars in the amounts observed. Further, their abundances are such that
they could have been produced in the big-bang nucleosynthesis. We will
discuss here briefly the data on $^4$He and $^2$H, and what constraints they
place on standard cosmology.

17.5.1 $^4$He

The observed helium abundances (always denoted by mass fraction Y)
in the Universe are quoted as lying in the broad range 0.13 < Y < 0.34.
The scatter is wide because of the uncertainties of various observational
estimates. Further, the estimate of primordial helium in the Sun at the
time the Solar System formed ~$4.54 \times 10^9$ years ago depends on the
solar model and hence cannot be uniquely fixed. M. Peimbert, S. Torres
Peimbert and J. F. Rayo have suggested that the break-down of Y at any
location is as follows:

$$Y = Y_0 + \Delta Y,$$
$$Y_0 = 0.23 \pm 0.02, \tag{17.13}$$
$$\Delta Y = (2.5 \pm 0.5) \times Z,$$

where $Y_0$ is the primordial helium abundance, $\Delta Y$ the stellar helium
abundance and Z the abundance of heavy elements made by stars. Since
Z < 0.02, $\Delta Y$ < 0.06.

There are occasional reports of low values of helium abundance and
these need to be probed more deeply. For helium, once produced, is difficult
to destroy. In this sense, the theoretical primordial value of Y mentioned
in Chapter 16 is expected to be the lower bound of Y found today.

We note that in the primordial picture $Y_0$ is relatively insensitive to
$h_0$ and $\Omega_0$. However, the introduction of new light leptons would push
up the neutron/proton ratio and hence the value of $Y_0$. The following
formula due to R. V. Wagoner summarizes this result for the fraction $\eta$
defined in (16.51) exceeding ~$10^{-5}$:

$$Y_0 = 0.333 + 0.0195 \log \eta + 0.380 \log \xi. \tag{17.14}$$

Here the fraction $\xi$ = 1 if no new particles except those considered in
Chapter 16 are assumed to be present in the early Universe. In terms
of our notation of Chapter 16, this implies g = 9. If there are more
particles, $g \to g + \Delta g$, where $\Delta g = \Delta g_b + (7/8)\,\Delta g_f$, and

$$\xi^2 = 1 + \frac{\Delta g}{g}. \tag{17.15}$$

For $Y_0$ < 0.25 and $\Omega_0$ = 0.01, only one new neutrino is allowed, the
so-called $\tau$-neutrino. If, however, $Y_0$ were as high as 0.28, up to four
new leptons would be permitted by this constraint, whereas a value as
low as 0.21 would land the standard big-bang model in real trouble. The
smallest value of $Y_0$ allowed by the standard model is close to 0.236.
Results from the accelerator experiments based on measuring the decay
width of the $Z^0$ boson suggest that the number of neutrino species is
3.01 ± 0.10. Thus there is a broad consistency between cosmology and
particle physics.
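
The sensitivity of $Y_0$ to extra particle species follows from the last term of equation (17.14) together with (17.15). The sketch below evaluates only the shift in $Y_0$, which does not depend on $\eta$. It assumes that each new neutrino species contributes two fermionic states, i.e. $\Delta g = 7/4$; that counting, and the function names, are assumptions of the example rather than statements from the text.

```python
import math

G_STANDARD = 9.0              # effective g of Chapter 16, as quoted in the text
DG_PER_NEUTRINO = 7.0 / 4.0   # assumed: (7/8) * 2 fermionic states per new species

def xi(extra_species):
    """Speed-up factor from equation (17.15)."""
    delta_g = DG_PER_NEUTRINO * extra_species
    return math.sqrt(1.0 + delta_g / G_STANDARD)

def helium_shift(extra_species):
    """Change in Y_0 from the 0.380 log(xi) term of equation (17.14)."""
    return 0.380 * math.log10(xi(extra_species))

for n in range(0, 5):
    print(f"{n} extra species: xi = {xi(n):.4f}, shift in Y_0 = {helium_shift(n):+.4f}")
```

Under these assumptions each additional species raises $Y_0$ by roughly 0.015 at first, with slightly smaller increments for later species, which is why a difference of a few hundredths in the observed $Y_0$ changes the number of permitted leptons.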

Observations of $Y_0$ are therefore of great importance and they continue
to be reported, as observers sharpen their spectroscopic diagnostics.
These estimates may be seen as indicative only in placing constraints on 
the parameters of the standard cosmology. 

17.5.2 $^2$H

The deuterium abundance, which we will denote here by X($^2$H), was
first measured in 1973, mainly from the Lyman-series absorption lines
in the ultraviolet spectra of the bright stars observed with the Copernicus
satellite. There have been several measurements of this important
fraction. It is found that generally

$$9 \times 10^{-6} < X(^2\mathrm{H}) < 3.5 \times 10^{-5}.$$

Although a mean interstellar value of X($^2$H) ~ $2 \times 10^{-5}$ is often quoted,
there is considerable variation in its value from cloud to cloud. It is not
clear whether these variations are due to partial destruction of primordial
deuterium through various processes. It has to be destruction, since so
far no satisfactory stellar scenario for production of deuterium is known.
Thus the primordial value would correspond to the upper end of the range
of observations. At least we expect it to exceed ~$2 \times 10^{-5}$. (Contrast this
situation with that for $^4$He, for which there is no destruction mechanism
but processes of production exist in stars.)

Referring back to Figure 16.3, we see that a primordial abundance
X($^2$H) > $2 \times 10^{-5}$ implies that the baryonic density at present cannot
exceed $4 \times 10^{-31}$ g cm$^{-3}$, which in turn sets an upper limit on the present
baryon-density parameter $(\Omega_B)_0$:

$$h_0^2\,(\Omega_B)_0 < 0.02. \tag{17.16}$$
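
The step from the density limit to the limit on the baryon-density parameter is a division by the critical density. A minimal sketch of that arithmetic follows; the physical constants are standard rounded values supplied for the example, and the function name is hypothetical.

```python
import math

G_CGS = 6.674e-8             # gravitational constant, cm^3 g^-1 s^-2
KM_PER_MPC = 3.0857e19       # kilometres in one megaparsec

def critical_density(h0):
    """Critical density 3 H0^2 / (8 pi G) in g cm^-3, with H0 = 100 h0 km/s/Mpc."""
    hubble_per_s = 100.0 * h0 / KM_PER_MPC
    return 3.0 * hubble_per_s ** 2 / (8.0 * math.pi * G_CGS)

rho_baryon_max = 4e-31       # g cm^-3, the limit quoted in the text
omega_b_h2 = rho_baryon_max / critical_density(1.0)
print(f"critical density for h0 = 1: {critical_density(1.0):.3e} g cm^-3")
print(f"upper limit on h0^2 (Omega_B)_0: {omega_b_h2:.3f}")
```

Since the critical density scales as $h_0^2$, dividing by its value at $h_0 = 1$ gives the combination $h_0^2 (\Omega_B)_0$ directly, and the result rounds to the 0.02 quoted above.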

It is interesting to note that in 1996, from measurements of deuterium
abundance in clouds around a high-redshift quasar, Tytler, Fan and Burles
placed limits on the baryon-density parameter:

$$h_0^2\,(\Omega_B)_0 \approx 0.024 \pm 0.006.$$

Thus, if matter in the Universe is predominantly made of baryons,
the Universe must be open. However, in modern thinking influenced by
inflation the condition $\Omega$ = 1 must be satisfied. So one invokes non-baryonic
dark matter and dark energy (a euphemism for the $\lambda$-term). If
all dark matter were baryonic, say distributed as black holes or burnt-out
cores of stars or brown dwarfs, etc., the currently favoured cosmological
model fails, as mentioned in the concluding section of this
chapter.

There are, however, fine tunings involved here. For the restriction 
on baryon density implies a relatively tight relation between density and 
temperature, i.e., the constant of proportionality in the relation $\rho_B \propto T^3$
of (16.52) has to be correctly chosen for the model to give the
right answer. One may therefore question whether this can be claimed 
as a deductive success of the big-bang cosmology. 


17.6 The microwave background radiation 

Measurements of the microwave background radiation (MBR) occupy 
the centre stage of observational cosmology today. As mentioned in 
Chapter 16, following the finding of the radiation background by Penzias 
and Wilson, the background has been assumed to be a relic of the early 
Universe. Following this interpretation, it has been probed by various 
teams of observers to obtain its spectrum, polarization, anisotropy and 
angular power spectrum. These observational details and the subtleties 
of their interpretation within the framework of a ‘standard’ model of 
the Universe are not appropriate in a text on general relativity. We will 
therefore only briefly summarize some relevant studies.

17.6.1 The spectrum

Fig. 17.8. The COBE plot of the spectrum of the cosmic microwave
background radiation.

To check the true blackbody character of the radiation, it is necessary to
have detectors above the Earth’s atmosphere, since ground-based measurements
do not reach the peak wavelengths of the expected blackbody
curve due to atmospheric absorption. There were several early attempts
using balloons and rockets. Some of these reported departures from
the Planckian spectrum, which later turned out to be false alarms. The most
accurate and exhaustive study was carried out in 1990 by the satellite
COBE.
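
Where the peak of the blackbody curve lies can be estimated from the Planck function itself. The sketch below scans the Planck intensity per unit frequency for a temperature of 2.735 K, the value COBE reported; the grid step and the function names are choices made for the example.

```python
import math

H_PLANCK = 6.62607015e-34   # J s
K_BOLTZ = 1.380649e-23      # J / K
C_LIGHT = 2.99792458e8      # m / s

def planck_intensity(freq_hz, temp_k):
    """Planck specific intensity B_nu(T) in W m^-2 Hz^-1 sr^-1."""
    x = H_PLANCK * freq_hz / (K_BOLTZ * temp_k)
    return 2.0 * H_PLANCK * freq_hz ** 3 / C_LIGHT ** 2 / math.expm1(x)

def peak_frequency_ghz(temp_k, step_ghz=0.1):
    """Frequency (GHz) at which B_nu peaks, found by a simple grid scan."""
    grid = [step_ghz * i for i in range(1, 10001)]
    return max(grid, key=lambda f: planck_intensity(f * 1e9, temp_k))

T0 = 2.735  # K, the COBE blackbody temperature
peak = peak_frequency_ghz(T0)
print(f"B_nu peaks near {peak:.1f} GHz, i.e. a wavelength of {C_LIGHT / (peak * 1e9) * 1e3:.2f} mm")
```

The peak lies near 160 GHz, that is, at wavelengths of about 2 mm, a region of the spectrum that is hard to observe from the ground, as noted above.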

This satellite was launched in 1989 and obtained a beautiful 
spectrum as shown in Figure 17.8. See Reference [68]. The COBE 
measurements gave a very precise Planckian spectrum with a blackbody 
temperature of 


$$T_0 = 2.735 \pm 0.06\ \mathrm{K}. \tag{17.17}$$

The overall sensitivity and accuracy of the experiment made it clear that
some of the earlier claims of significant departures from the Planckian
spectrum at high frequencies were erroneous. Indeed, even laboratory
experiments are not known to produce a Planckian spectrum of this
level of accuracy.

17.6.2 The angular power spectrum

When we look at the distribution of a physical quantity across the celestial
sphere, its anisotropies can be best described with the help of spherical
harmonics. The physical quantity describing the anisotropy of the
MBR is its temperature $T(\theta, \phi)$, written as a function of two spherical
polar coordinates (such as declination and right ascension). We may
accordingly write

$$\frac{\Delta T(\theta, \phi)}{T} = \sum_{l=1}^{\infty} \sum_{m=-l}^{l} a_{lm} Y_{lm}(\theta, \phi). \tag{17.18}$$

The sum over l begins with 1 instead of zero, for the zeroth perturbation
is isotropic over the whole sky, and can be absorbed into T. The l = 1
term is the so-called dipole-anisotropy term, which, as we shall see,
arises from the motion of the Earth relative to the rest frame of the
MBR. Henceforth we will not include this term either in the above series.
The next, l = 2, mode is the quadrupole mode.

The angular power spectrum is specified by quantities $C_l$ defined by

$$C_l = \frac{1}{2l+1} \sum_{m=-l}^{l} \langle |a_{lm}|^2 \rangle, \tag{17.19}$$

where the averaging is with respect to all realizations of the sky and
summed over all m. Thus each $C_l$ tells us the relative strength of the lth
harmonic in the overall distribution.

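In general we will be interested in looking at $\Delta T/T$ over a certain
angular scale $\vartheta$. Thus, if we take two directions denoted by unit vectors
$\mathbf{e}_1$ and $\mathbf{e}_2$ enclosing an angle $\vartheta$ between them, we get

$$\mathbf{e}_1 \cdot \mathbf{e}_2 = \cos\vartheta. \tag{17.20}$$

Now we define the autocovariance function which tells us how
the temperature fluctuations compare over directions separated by the
angle $\vartheta$:

$$C(\vartheta) = \left\langle \frac{\Delta T}{T}(\mathbf{e}_1)\, \frac{\Delta T}{T}(\mathbf{e}_2) \right\rangle, \tag{17.21}$$

which, for stationary fluctuations, can be expressed in the form

$$C(\vartheta) = \frac{1}{4\pi} \sum_{l=2}^{\infty} (2l+1)\, C_l\, P_l(\cos\vartheta). \tag{17.22}$$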
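
The Legendre sum for the autocovariance function can be evaluated directly once a set of coefficients is given. The sketch below uses the standard recurrence for the Legendre polynomials; the input spectrum is made up purely for illustration, and the function names are hypothetical.

```python
import math

def legendre(l_max, x):
    """Legendre polynomials P_0(x)..P_lmax(x) via the Bonnet recurrence."""
    values = [1.0, x]
    for l in range(1, l_max):
        values.append(((2 * l + 1) * x * values[l] - l * values[l - 1]) / (l + 1))
    return values[: l_max + 1]

def autocovariance(c_l, angle_deg):
    """C(angle) = (1/4pi) * sum_{l>=2} (2l+1) C_l P_l(cos angle).

    `c_l` maps the multipole l to C_l; entries with l < 2 are ignored.
    """
    l_max = max(c_l)
    p = legendre(l_max, math.cos(math.radians(angle_deg)))
    total = sum((2 * l + 1) * c * p[l] for l, c in c_l.items() if l >= 2)
    return total / (4.0 * math.pi)

# Illustrative (made-up) spectrum with C_l proportional to 1 / (l (l + 1)).
spectrum = {l: 1.0 / (l * (l + 1)) for l in range(2, 31)}
for angle in (0.0, 10.0, 60.0, 180.0):
    print(f"C({angle:5.1f} deg) = {autocovariance(spectrum, angle):+.4f}")
```

A spectrum containing only the quadrupole gives $C(\vartheta) = (5/4\pi)\,C_2\,P_2(\cos\vartheta)$, which is a convenient check on any implementation of the sum.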

Suppose that, from observations of a single sky, we have obtained
the estimate of the autocovariance function as $\tilde{C}(\vartheta)$:

$$\tilde{C}(\vartheta) = \frac{1}{4\pi} \sum_{l=2}^{\infty} (2l+1)\, \tilde{C}_l\, P_l(\cos\vartheta), \tag{17.23}$$

where the $\tilde{C}_l$ are determined from a single observation of the sky. In
this case one needs to estimate the cosmic variance of the quantity $\tilde{C}(\vartheta)$.
This can be shown to be

$$\left\langle |\tilde{C}(\vartheta) - C(\vartheta)|^2 \right\rangle = \frac{2}{(4\pi)^2} \sum_{l=2}^{\infty} (2l+1)\, C_l^2\, P_l^2(\cos\vartheta). \tag{17.24}$$

Fig. 17.9. The COBE map of small-scale anisotropy of the cosmic microwave
background radiation.

Fig. 17.10. The power spectrum of anisotropies of the radiation background.

The first evidence of small-scale anisotropy came with the COBE 
satellite in 1992. (See Reference [74].) The COBE map of the sky is 
shown in Figure 17.9. 

Later a more detailed angular power spectrum along the above lines 
was obtained by the Wilkinson Microwave Anisotropy Probe (WMAP) 
satellite. A simplified WMAP power spectrum is shown in Figure 17.10. 

In practice the details are considerably intricate when one attempts 
to extract the signal from the actual sky data. We will not go into 
those details here. We point out, however, that the Legendre polynomials
$P_l(\cos\vartheta)$ contain the following information: the typical angular scale
of anisotropy corresponding to the index l is of the order of 180°/($\pi l$).

The value of these measurements lies in constraining the theories 
of structure formation and through them the cosmological parameters. 
It is too early to say what the final picture is going to be like. We will 
content ourselves with listing a few possible causes of anisotropies of 
the MBR so that their signals may be looked for in such measurements. 
The smallness of angles implies that we are looking at higher harmonics
l in the range ~10 to ~$10^4$.

The Sachs-Wolfe effect measures the metric fluctuations near the
last-scattering surface. For example, if there is inhomogeneity of matter
(clumping/voids) in a given region, this would lead to fluctuation of $g_{ik}$
from the homogeneous Robertson-Walker form. In Newtonian terms we
may argue that the photons making up the radiation background come
out of different potential ($\phi$) wells, and this would produce a change of
energy, and hence of T, given by

$$\left(\frac{\Delta T}{T}\right)_{\text{energy}} = \frac{\delta\phi}{c^2}. \tag{17.25}$$

In addition to this there is time dilatation, so that the photons emerging
from a potential well are delayed in relation to surface photons and
therefore encounter the scale factor S at a later epoch. For the
Einstein-de Sitter universe $S \propto t^{2/3}$ and the fluctuation in T is given by

$$\left(\frac{\Delta T}{T}\right)_{\text{time delay}} = -\frac{\delta S}{S} = -\frac{2}{3}\frac{\delta t}{t} = -\frac{2}{3}\frac{\delta\phi}{c^2}, \tag{17.26}$$

because the gravitational redshift produces the above time delay. On
adding the two effects we get

$$\frac{\Delta T}{T} = \left(\frac{\Delta T}{T}\right)_{\text{energy}} + \left(\frac{\Delta T}{T}\right)_{\text{time delay}} = \frac{1}{3}\frac{\delta\phi}{c^2}. \tag{17.27}$$


In addition to this there can be tensor fluctuations, which will produce
small contributions to $\Delta T/T$. Since these fluctuations are associated
with time-dependent changes in the metric tensor they are essentially
caused by gravitational waves. Some inflationary models predict
gravitational-wave-type fluctuations, which are potentially detectable.

The Sunyaev-Zel’dovich effect suggests that the photons of the MBR
entering a cluster with hot gas will be ‘kicked upstairs’ to higher (X-ray)
energy by the Thomson scattering from high-energy electrons. Thus, if
we observe the MBR in the direction of the cluster, we should find a drop
in the intensity of radiation. A crude approximation modelling the cluster as
an isothermal sphere of radius $R_c$ gives a fractional drop in the MBR
temperature as

$$\frac{\Delta T}{T} \simeq -\frac{4 R_c n_e k T_e \sigma_T}{m_e c^2}, \tag{17.28}$$

where $n_e$ is the electron density in the cluster, $T_e$ the electron temperature,
$m_e$ the electron mass and $\sigma_T$ the Thomson-scattering cross section. So
far there have been several claims of positive detection of this effect,
in clusters ranging up to redshifts >2. This not only shows that the
MBR extends that far, but also, by giving an estimate of $R_c$, enables a
determination of Hubble’s constant. The values of $H_0$ determined in this
way are of the order of <40 km s$^{-1}$ Mpc$^{-1}$, i.e., much lower than the
standard measurements.
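
An order-of-magnitude feel for equation (17.28) can be had by inserting representative numbers. In the sketch below the cluster radius, electron density and electron temperature are assumed, illustrative values that do not come from the text; the physical constants are standard rounded values, and the function name is hypothetical.

```python
SIGMA_T_CM2 = 6.6524e-25      # Thomson cross section, cm^2
K_BOLTZ_EV = 8.617333e-5      # Boltzmann constant, eV / K
ME_C2_EV = 510998.95          # electron rest energy, eV
CM_PER_MPC = 3.0857e24

def sz_fractional_change(radius_mpc, n_e_cm3, t_e_kelvin):
    """Fractional MBR temperature change from equation (17.28); negative means a drop."""
    optical_depth_scale = radius_mpc * CM_PER_MPC * n_e_cm3 * SIGMA_T_CM2
    return -4.0 * optical_depth_scale * K_BOLTZ_EV * t_e_kelvin / ME_C2_EV

# Assumed, order-of-magnitude cluster parameters (not taken from the text).
delta = sz_fractional_change(radius_mpc=0.5, n_e_cm3=1e-3, t_e_kelvin=1e8)
print(f"Delta T / T = {delta:.2e}")
```

For these assumed values the drop is a few parts in $10^5$, and it scales linearly with each of $R_c$, $n_e$ and $T_e$, which is why a measurement of the drop combined with X-ray data constrains the physical size of the cluster.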

Clearly this effect, though not directly linked with the formation of 
large-scale structure, is nevertheless a useful tool for cosmologists. 

Sakharov oscillations constitute another measurable effect of MBR
anisotropy, through velocity effects from acoustic oscillations of perturbations
inside the horizon at the last-scattering surface. These material
oscillations lead to fluctuations of photons and their temperature, both
being related to the wavelength of oscillations. So one may see periodic
behaviour, showing a peak in the $C_l$ coefficients of the power spectrum
estimated at

$$l_{\text{peak}} \approx 200\,\Omega_0^{-1/2}. \tag{17.29}$$


The announcement of the detection of such a peak (incongruously
called the ‘Doppler peak’, since oscillations of matter rather
than velocities are responsible for the effect) was made in 2000 by the
‘BOOMERANG’ (Balloon Observations Of Millimetric Extragalactic
Radiation ANd Geophysics) group of experimentalists. They found
a peak amplitude $\Delta T_{200} = (69 \pm 8)\ \mu$K at $l_{\text{peak}} = (197 \pm 6)$. The group
in fact measured the angular power spectrum at l = 50 to 600.
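
If the peak position is taken to scale as $l_{\text{peak}} \approx 200\,\Omega_0^{-1/2}$, the measured position can be turned into a rough estimate of the density parameter. The sketch below does only that inversion with first-order error propagation; it ignores every other parameter that shifts the peak, and the function names are hypothetical.

```python
def omega_from_peak(l_peak, l_scale=200.0):
    """Invert l_peak = l_scale * Omega_0**(-1/2) for Omega_0."""
    return (l_scale / l_peak) ** 2

def omega_uncertainty(l_peak, sigma_l, l_scale=200.0):
    """First-order error propagation: d(Omega_0)/Omega_0 = 2 d(l_peak)/l_peak."""
    return 2.0 * omega_from_peak(l_peak, l_scale) * sigma_l / l_peak

omega = omega_from_peak(197.0)
sigma = omega_uncertainty(197.0, 6.0)
print(f"Omega_0 = {omega:.2f} +/- {sigma:.2f}")
```

On this crude reading the quoted peak position corresponds to a total density parameter close to unity, with an uncertainty of a few per cent coming from the error on the peak position alone.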

Following COBE and WMAP, a more ambitious satellite-borne
experiment, viz. PLANCK, has been in preparation. The ESA’s
Planck Surveyor will measure the radiation at frequencies in the range
30-100 GHz with the Low Frequency Instrument and in the range
100-190 GHz with the High Frequency Instrument. The expected resolution
is ~10 arcmin with sensitivity for $\Delta T/T \sim 2 \times 10^{-6}$.

The interest of cosmologists has now shifted towards understanding 
more (smaller) peaks of the power-spectrum curve occurring at higher 
frequencies, as well as the small degree of polarization found in the 
radiation. The WMAP team was the first to report this feature and it is 
hoped to measure it more accurately through later surveys. 

To summarize, the MBR is being looked upon as a mine of information
by big-bang cosmologists. Since the MBR is regarded as a relic of
the early Universe, at least dating back to the last-scattering surface, its
spectrum and anisotropies should contain valuable information about the
past developments of the Universe, much as the archaeological remains
at a site contain information about its past history.


17.7 Dark matter and dark energy 

17.7.1 Spiral galaxies 

The best handle on the mass contained in a typical spiral is given by its
rotation curve. Figure 17.11 illustrates the principle by means of a flat,
disc-shaped object representing a circular distribution of stars moving
round a common centre C. The rotation velocity v of a star S of mass m at a
distance r from C is related (in an equilibrium distribution) to the
gravitational force $F_r$ acting on S towards the centre:

$$m \frac{v^2}{r} = F_r. \tag{17.30}$$

Therefore, if we have v as a function of r, we get $F_r$ as a function
of r. Then, by Newton’s law of gravitation (which is applicable here
because the gravitational fields are weak), we can determine the mass
distribution. For example, if most of the mass were concentrated in the
nuclear region around C, we would have $F_r \propto r^{-2}$ and $v \propto r^{-1/2}$. The
light distribution across a spiral galaxy does suggest the above to be a
good approximation. However, in actual fact the rotation curve - the
function v(r) - is flat for most galaxies. That is, after rising sharply
outside the nuclear region, v first declines slightly and then remains
constant, equal to $v_0$ (say). Moreover, this relation extends well beyond
the visible disc. Figure 17.12 shows some examples.
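
What a flat rotation curve implies for the mass can be sketched by treating the matter inside radius r as spherically distributed, so that $M(r) = v^2 r/G$. That simplification, the rotation speed of 220 km/s and the function names below are assumptions made for illustration only.

```python
G_SI = 6.674e-11          # m^3 kg^-1 s^-2
M_SUN_KG = 1.989e30
M_PER_KPC = 3.0857e19

def enclosed_mass_msun(v_kms, r_kpc):
    """Mass inside radius r for a circular orbit: M(r) = v^2 r / G (spherical approximation)."""
    v = v_kms * 1e3
    r = r_kpc * M_PER_KPC
    return v * v * r / G_SI / M_SUN_KG

def keplerian_speed_kms(mass_msun, r_kpc):
    """Speed expected if all the mass sat at the centre: v = sqrt(G M / r)."""
    r = r_kpc * M_PER_KPC
    return (G_SI * mass_msun * M_SUN_KG / r) ** 0.5 / 1e3

v0 = 220.0  # km/s, an assumed illustrative flat rotation speed
for r in (10.0, 20.0, 40.0):
    print(f"r = {r:4.0f} kpc: M(<r) = {enclosed_mass_msun(v0, r):.2e} M_sun")
```

With a constant speed the enclosed mass grows in proportion to r, whereas a centrally concentrated mass would make the speed fall off as $r^{-1/2}$: the mass inside 10 kpc alone would give only about 110 km/s at 40 kpc.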

The implication of this result is either that there is more mass in the
outer parts of the galaxy than is indicated by its luminosity distribution,
or that Newton’s laws of motion and the inverse-square law of gravitation
might not be valid over the Galactic distance range (a few kiloparsecs).
Taking the former (and less radical) view, astronomers have estimated
the masses of spirals. S. M. Faber and J. S. Gallagher have listed the
rotation velocities and masses contained within the Holmberg radius
(where the surface brightness drops to ~26.5 $m_{pg}$ arcsec$^{-2}$) for 39 spirals.
Since the luminosities are also known, we can estimate the mean
value of $\eta$ (the mass-to-light ratio in units of $M_\odot/L_\odot$) for this sample.
The result is

$$\eta = (9 \pm 1)\,h_0. \tag{17.31}$$

Fig. 17.11. A disc-like distribution as shown in the figure fails to produce
a flat rotation curve as seen in Figure 17.12(b).

Fig. 17.12. The rotation curve expected to be produced by a galaxy (a) and
the observed ones (b), which are flat over a long distance.

17.7.2 Clusters of galaxies 

As early as 1933 F. Zwicky had pointed out what has now become well
known as the missing-mass problem in clusters. The problem can be
briefly stated as follows. If we estimate the mass of galaxies moving
in one another’s gravitational field in a cluster, then the virial theorem
gives the mass of the cluster in terms of the velocity dispersion and the
effective mean radius:

$$M = \langle v^2 \rangle \frac{R}{G}. \tag{17.32}$$

From observations of the velocity dispersion $\langle v^2 \rangle^{1/2}$ we can therefore
estimate the total mass M in the cluster. This value comes out considerably
higher than that estimated on the basis of mass/light ratios $\eta_G$ of
individual galaxies. That is, if we see n galaxies in the cluster and if the
total luminosity in the cluster is L, then the mass in the cluster is $L\eta_G$.
Zwicky was the first to point out that

$$L\,\eta_G \ll M. \tag{17.33}$$

For the Coma cluster, for example, $M/(L\eta_G)$ ~ 300 (see Exercise 7).

Typically one arrives at a cluster mass in the neighbourhood of
$10^{15}\,h_0^{-1} M_\odot$. Observations suggest that there are about 4000 large
clusters within a ‘local’ sphere of radius 600 $h_0^{-1}$ Mpc. This leads to a mean
density of matter in clusters of

$$\rho_{cl} \approx 4 \times 10^{-31}\,h_0^2\ \text{g cm}^{-3}. \tag{17.34}$$

The density estimated for galaxies is of the same order, although not
all galaxies reside in clusters. The clusters have proportionately higher
mass than the galaxies contained in them because the M/L ratio for
them is as high as ~$300\,h_0 \times M_\odot/L_\odot$, about ten times higher than that
for galaxies. This is why the clusters appear to require greater amounts
of dark matter for their virial equilibrium.
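
The mean density of matter in clusters follows from the three numbers just quoted. A minimal sketch of the arithmetic, evaluated for $h_0 = 1$, is given below; the unit conversions are standard rounded values and the function name is hypothetical.

```python
import math

M_SUN_G = 1.989e33        # grams
CM_PER_MPC = 3.0857e24

def cluster_mass_density(n_clusters, cluster_mass_msun, sphere_radius_mpc):
    """Mean density (g cm^-3) of n clusters of given mass spread over a sphere."""
    volume_cm3 = (4.0 / 3.0) * math.pi * (sphere_radius_mpc * CM_PER_MPC) ** 3
    return n_clusters * cluster_mass_msun * M_SUN_G / volume_cm3

# Values from the text, evaluated for h0 = 1.
rho = cluster_mass_density(4000, 1e15, 600.0)
print(f"mean density in clusters: {rho:.1e} g cm^-3")
```

The result is about $3 \times 10^{-31}$ g cm$^{-3}$, the same order as the rounded figure quoted above. Since the mass scales as $h_0^{-1}$ and the volume as $h_0^{-3}$, the density carries a factor $h_0^2$.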

Observations of X-rays from clusters have indicated that the emission
is through bremsstrahlung from hot gas, and the amount of baryonic
matter in the Coma cluster is not sufficient to account for the missing
mass estimated by application of the virial theorem. If the ratio of baryonic
to total gravitating matter in the Coma is representative of the
universal value, then the total density parameter $\Omega_0$ is constrained by
the inequality

$$\Omega_0 \leq \frac{0.15\,h_0^{-1/2}}{1 + 0.55\,h_0^{3/2}}. \tag{17.35}$$

With this type of inequality, it is clear (i) that, if the deuterium in 
the Universe were made primordially, then we cannot have the density 
parameter attain the upper limit with baryons alone; and (ii) that the 
Universe is open ($k = -1$), unless there is a large quantity of dark
matter residing outside the clusters. Already at this stage the known
baryonic content of the cluster mass ($M_{\rm b}$) as a fraction of the total cluster
mass ($M_{\rm tot}$) threatens a contradiction with observations of deuterium
abundance. For example, for the Coma cluster, we have

$\dfrac{M_{\rm b}}{M_{\rm tot}} \geq 0.01 + 0.05\, h_0^{-3/2}$. (17.36)
If this ratio were universal, it would lead to a conflict with the
deuterium-abundance constraint for $\Omega_0 = 1$. In fact, if $h_0 = 0.65$, say,
then requiring the baryonic density parameter (this ratio times $\Omega_0$) to equal
$0.01 h_0^{-2}$, for consistency with the deuterium abundance, gives $\Omega_0 \sim 0.23$.
Thus, if the universal value of $\Omega_0$
were claimed to be unity (as originally required by inflation), then the
conclusion has to be that baryons are selectively located in clusters, 
while the non-baryonic matter fills the intercluster space. This epicyclic 
statement could be avoided, if one admits to a low-density Universe. 
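
The arithmetic behind the value $\Omega_0 \sim 0.23$ can be sketched as follows, assuming that the Coma baryon fraction $0.01 + 0.05 h_0^{-3/2}$ of (17.36) holds universally and that the deuterium abundance fixes the baryonic density parameter at $0.01 h_0^{-2}$. The function and argument names are illustrative.

```python
def omega0_from_baryon_fraction(h0, omega_b_h2=0.01):
    """Total density parameter implied by a universal Coma-like baryon fraction."""
    baryon_fraction = 0.01 + 0.05 * h0 ** -1.5   # M_b / M_tot from (17.36)
    omega_b = omega_b_h2 / h0 ** 2               # baryonic density parameter
    return omega_b / baryon_fraction


omega0 = omega0_from_baryon_fraction(0.65)
print(f"Omega_0 ~ {omega0:.2f}")
```

For $h_0 = 0.65$ this gives a value close to 0.22, consistent with the rounded figure quoted above.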

17.7.3 Dark energy 

We have already referred to the Type Ia supernovae in high-redshift
galaxies providing a test of the Hubble relation. In section 17.1.1 we
found that Einstein’s cosmological constant $\lambda$ had to be resurrected
in order to explain the observed redshift-magnitude relation. Later it
became apparent that this remedy was too simple and rather unsatisfactory
on two counts. Firstly, if one assumes (as is natural) that the value of
$\lambda$ observed today is a relic of the inflationary era, then a major difficulty
arises. The inflation was driven by an effective $\lambda$ some 108 orders of
magnitude higher than the $\lambda$ observed today. In short, we want today a
finely tuned relic $\lambda$ of magnitude $\sim 10^{-108}$ of the primordial inflationary
$\lambda$. Secondly, a constant $\lambda$ turns out to be inadequate to explain the
z-m curve for supernovae up to redshifts ~1.6. One needs an epoch-dependent
$\lambda$, so an elaborate theoretical structure that goes well beyond
Einstein’s simple modification needs to be created.

Today this extra force is popularly known as dark energy and various 
theoretical models for it are being investigated. 

17.8 The standard model of cosmology 

The studies during the first decade of the twenty-first century have concentrated
largely on the microwave background and the observation of
redshifts and apparent magnitudes of Type Ia supernovae. These studies,
together with the constraints imposed by structure-formation scenarios,
the age of the Universe, the abundances of light nuclei, the density of
dark matter, etc., have led to the following breakdown of the matter-energy
contents of the most favoured or ‘standard’ model of cosmology:

$\Omega_{\rm m} = 0.04$, $\Omega_{\rm nbdm} = 0.23$, $\Omega_\Lambda = 0.73$. (17.37)

The fact that these parameters can be quoted with very small error 
bars has led to the adjective ‘precision cosmology’ being applied to the 
standard model. Also, because several constraints are satisfied by these 
parameters, this approach is referred to as ‘concordance cosmology’. 
Complimentary though these adjectives are, the present confidence
of cosmologists in the standard model may turn out to be illusory. For, to 
begin with, these ‘omega’ values are not determined directly. Neither are 
the theoreticians able to identify the non-baryonic dark matter in terms of 
any known particle. The dark-energy part also rests on highly speculative 
physics, having moved away from the original cosmological constant of 
Einstein. In fact, if we take the ‘last-scattering-surface’ interpretation
of the microwave background, we find that no event in the Universe, 
prior to this epoch of redshift ~1000, could be directly observed. Thus 
we have to rely on indirect evidence in the form of relics of those early 
epochs - and interpretation of relics can very often be controversial. 
Certainly it is not unique. 

Likewise, the high-energy physics on which the properties of the 
early Universe rest has not been tested beyond energies of the order of 
~1000 GeV. Thus there is a big gap between what has been tested and
verified and what is uncritically assumed, the latter being twelve orders
of magnitude in energy above the former. A key feature of the standard
model, inflation, also rests in the speculative era. As mentioned earlier,
the inflationary model has not yet been obtained as an exact solution of
the field equations with matched boundary conditions, as the Kerr and
Schwarzschild solutions have been. When stellar astrophysicists were faced
with the possibility of stars existing as very compact balls filled with
neutrons, the so-called neutron stars, they spent considerable research
effort trying to understand the state of matter at densities $\sim 10^{15}$ g cm$^{-3}$.
Big-bang cosmologists have not spent any part of their time worrying
about matter densities as high as $\sim 10^{50}$ g cm$^{-3}$ that existed at the GUT
epoch.

These are some of the reasons why one needs to be cautious 
about any conclusions drawn from such early epochs. Before modelling 
the Universe in such extreme conditions, there is a need to examine 
the theoretical foundations of relativity, to get a feel for the quantum 
theory of gravity and to clarify the uncertainties existing in some of the 
phenomena observed in extragalactic astronomy. In the following, final, 
chapter we discuss briefly some of these frontier areas. 


Exercises 

1. Suppose the intergalactic medium produces an absorption cross section $\kappa(\lambda)$
per unit mass at wavelength $\lambda$. Show that the increase in apparent magnitude
of a galaxy of redshift $z_1$ in the steady-state Universe due to this process is
given by


$\Delta m = 2.5 \log_{10} e \cdot \tau(z_1, \lambda_0)$,

where $\lambda_0$ is the wavelength of observation and



where $\rho_0$ is the density of absorbing material. (For the steady-state Universe
assume the de Sitter line element and a constant density of matter.)

2. If the luminosity of a galaxy seen at the epoch t is related to that at the present epoch
$t_0$ by the formula

L(t) = L(to)(jj, 

where a = constant, calculate the change in the apparent magnitude produced 
by this effect for a galaxy of redshift z in the Einstein-de Sitter cosmology. 

3. In a globular cluster the metal content $Z \sim 10^{-3}$ and the ratio of horizontal-branch
stars to red giants is 0.9. Show that in the $l = 1$ model the age of the
globular cluster is about $11.9 \times 10^{9}$ years, whereas in the $l = 2$ model it is
increased to around $2.0 \times 10^{10}$ years.

4. Show that in a disc-shaped galaxy with surface density $\sigma(r) \propto r^{-1}$ one gets
flat rotation curves

$v^2 = 2\pi G r \sigma(r)$ = constant.

5. Suppose $I(\lambda) \propto \lambda^2$ in the range 2500 Å < $\lambda$ < 5000 Å. A galaxy of redshift
0.5 is being observed in a wavelength band centred on 5000 Å. Another galaxy
of redshift 0.7 is also observed at 5000 Å. Show that the K-terms for the two
galaxies will differ by ~0.41 mag.

6. A radio galaxy of redshift z = 0.1 has a spectral function $\propto \nu^{-1}$ and a luminosity
of $10^{44}$ erg s$^{-1}$ over the frequency range 150 MHz < $\nu$ < 1500 MHz. For
$h_0 = 1$ show that the flux density of the galaxy is ~350 Jy at 1000 MHz and
~1750 Jy at 200 MHz. (Neglect any cosmological effects.)

7. In the Coma cluster of galaxies the observed velocity dispersion is ~861
km s$^{-1}$, while the radius of the cluster is $\sim 4.6 h_0^{-1}$ Mpc. Show that the cluster
mass given by the virial theorem is $\sim 1.5 \times 10^{15} h_0^{-1} M_\odot$. The total luminosity
of the cluster is estimated at $\sim 7.5 \times 10^{12} h_0^{-2} L_\odot$. Show that the mass/light-ratio
parameter $\eta$ for the cluster is $\sim 300 h_0$.

8. Let f(L)dL denote the number of radio sources per unit volume in the
luminosity range (L, L + dL). Suppose that for small redshifts the plot of log z
against log L follows a straight line of slope 1/2. Also assume that the number
of points in equal intervals of log z is found to be constant. Using Euclidean
geometry with distance $\propto z$, deduce from these observations that $f(L) \propto L^{-2.5}$.
The survey is limited to sources with flux density exceeding $S_0$.

9. The nucleus $^{87}$Rb decays to $^{87}$Sr with a half-life of $\tau = 4.7 \times 10^{10}$ years. Let
X(t) and Y(t) denote the numbers of these nuclei in a meteorite at any time
t, so that the quantity X(t) + Y(t) is conserved. Let $t_0$ denote the epoch when
the Solar System was formed. Show that a plot of relative abundances X(t)/Z
against Y(t)/Z, where Z is the number of $^{86}$Sr nuclei (which remain unchanged),
leads to a straight line whose slope is given by

$\exp(\lambda t_0) - 1$,


where $\lambda = \tau^{-1} \ln 2$.

10. The ratio of occupied levels for J = 1 and J = 0 states for the CN molecule
in the star ζ Ophiuchi is 0.55 ± 0.05 and in the star ζ Persei it is 0.48 ± 0.15.
The energy difference between the two levels is equal to kT, T = 5.47 K and the
occupation weights are $g_1/g_0 = 3$. Deduce that the temperatures of the incident
radiation lie in the respective ranges 3.22 ± 0.15 K and 3.00 ± 0.6 K.

11. Suppose nuclear physics tells us that the age of a galaxy of redshift z = 0.5
is $10^{10}$ years. Use this information to set a limit on a function of $H_0$ and $q_0$. If
$H_0^{-1} = 1.8 \times 10^{10}$ years, is $q_0 = 1$ possible?

12. A radio source shows an angular separation of 1′ of arc from a galaxy of
redshift z = 0.44. Using the Einstein-de Sitter cosmology, estimate the linear
separation of the radio source from the galaxy, assuming that source and galaxy
are at the same redshift. (Use $H_0^{-1} = 1.8 \times 10^{10}$ years.)

13. Let $\sigma(r)$ denote the surface mass density at a point P located at distance r
from the centre of a thin, disc-shaped galaxy. Show that the gravitational force 
iy at P is directed towards the centre of the galaxy and is given by 




Chapter 18 

Beyond relativity 


We have come to the end of our account of the theories of relativity: 
special and general. While the former was briefly reviewed in the first 
chapter, we spent 16 chapters presenting the general theory from scratch. 
After preparing the background of vectors and tensors in the curved 
spacetime, we introduced the notions of parallel propagation, covari¬ 
ant differentiation, spacetime curvature and symmetries of motion. We 
then introduced physics through the notions of the action principle and 
energy-momentum tensors. 

This was the appropriate stage to introduce the basics of gen¬ 
eral relativity: the principle of equivalence, Einstein’s field equations 
and their Newtonian limit. Following these notions, we introduced the 
Schwarzschild solution and the various tests of general relativity, largely 
within the Solar System. We also discussed the budding field of grav¬ 
itational radiation and the attempts to detect it coming from cosmic 
sources. Our next topic was relativistic astrophysics, which deals with 
compact massive objects such as supermassive stars and black holes. We 
also briefly touched upon the very interesting topic of gravitational lensing.
This was followed by a discussion of some highlights of relativistic
cosmology. 

This presentation is indicative of the scope of general relativity. 
While it has created a niche for itself in theoretical physics as a remark¬ 
able intellectual exercise, it has also justified its status as the most effec¬ 
tive physical theory of gravitation by explaining and predicting several 
gravitational phenomena. At the same time we need to look ahead and ask 
whether the search for the ideal theory of gravitation ends here or whether 
there is scope for further improvement in its framework. Certainly,
despite the successes of general relativity, we are still a long way from
understanding gravity. When Newton was asked deep questions about
the nature of gravity he replied Non fingo hypotheses.¹ Here we discuss
a few assorted ideas inspired by general relativity, or attempting to take
it further. They neither present the last word nor claim to be exhaustive.

Fig. 18.1. The Foucault pendulum at the Inter-University Centre for
Astronomy and Astrophysics takes 75 hours to complete one rotation round
the vertical axis. The suspended ball slowly changes its direction of
oscillation as seen against the background of the floral design below.

18.1 Mach's principle 

There are two ways of measuring the Earth’s spin about its polar axis. By 
observing the rising and setting of stars the astronomer can determine 
the period of one revolution of the Earth around its axis: the period of
23 h 56 m 4.1 s. The second method employs a Foucault pendulum whose
plane gradually rotates around a vertical axis as the pendulum swings 
(see Figure 18.1). Knowing the latitude of the place of the pendulum, 
it is possible to calculate the Earth’s spin period. The two methods give 
the same answer. 
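
The second method can be illustrated numerically. At latitude $\varphi$ the plane of a Foucault pendulum turns once in $T_{\rm spin}/\sin\varphi$, so the spin period follows from the measured precession period. The sketch below uses the 75-hour figure quoted for the pendulum in Figure 18.1; the latitude of 18.5° N for Pune is an assumed round value, not taken from the text, and the function name is illustrative.

```python
import math


def spin_period_from_pendulum(precession_hours, latitude_deg):
    """Earth's spin period (hours) inferred from a Foucault pendulum.

    Uses T_precession = T_spin / sin(latitude).
    """
    return precession_hours * math.sin(math.radians(latitude_deg))


# 75 hours is quoted in the caption of Figure 18.1; 18.5 degrees is assumed
spin_period_hours = spin_period_from_pendulum(75.0, 18.5)
print(f"Inferred spin period: {spin_period_hours:.1f} hours")
```

The result, a little under 24 hours, agrees with the astronomical value to within the rounding of the inputs.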

At first sight this does not seem surprising. Since we are measuring 
the same quantity, we should get the same answer regardless of the 
method used. Closer examination, however, reveals why the issue is non¬ 
trivial. The two methods are based on different assumptions. The first 
method measures the Earth’s spin period against a background of distant 
stars, whereas the second employs standard Newtonian mechanics in 
a spinning frame of reference. In the latter case, we take note of how 
Newton’s laws of motion are modified when their consequences are 
measured in a frame of reference spinning relative to the ‘absolute 
space’ in which these laws were assumed, by Newton, to hold. 

1 I do not frame hypotheses. 


Fig. 18.2. A schematic 
description of Newton's 
bucket experiment. The 
stationary bucket (a) hanging 
by a thread has the water 
level in it flat (and horizontal). 
However, if the bucket is 
twisted round the thread and 
let go, the twisted thread 
unwinds and makes the bucket 
spin. As the bucket spins 
rapidly, the water level in it 
becomes curved (b), rising 
at the rim and dipping at the 
centre. Newton argued that 
this experiment demonstrated 
rotation relative to absolute 
space. 




Thus, implicit in the assumption that equates the two methods is the 
coincidence of absolute space with the background of distant stars. It 
was Ernst Mach in the nineteenth century who pointed out that this coincidence
is non-trivial. He read something deeper in it, arguing that the postulate 
of absolute space that allows one to write down the laws of motion 
and arrive at the concept of inertia is somehow intimately related to the 
background of distant parts of the Universe. This reasoning is known as 
‘Mach’s principle’ and we will analyse it further. 

When expressed in the framework of the absolute space, Newton’s 
second law of motion takes the familiar form

P = mf. (18.1) 

This law states that a body of mass m subjected to an external force P 
experiences an acceleration f. Let us denote by E the coordinate system 
in which P and f are measured. This frame represents Newton’s absolute 
space. 

Newton was well aware that his second law has the simple form 
(18.1) only with respect to E and those frames that are in uniform motion 
relative to E. If we choose another frame E' that has an acceleration a 
relative to E, the law of motion measured in E' becomes 

P' = P − ma = mf'. (18.2)

Although (18.2) outwardly looks the same as (18.1), with f' the
acceleration of the body in E', something new has entered into the force
term. This is the term −ma, which has nothing to do with the external
force but depends solely on the mass m of the body and the acceleration 
a of the reference frame relative to the absolute space. Realizing this 
aspect of the additional force in (18.2), Newton termed it ‘inertial force’. 
As this name implies, the additional force is proportional to the inertial 
mass of the body. Newton discusses this force at length in his Principia, 
citing the example of a rotating water-filled bucket (see Figure 18.2). 
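
The rotating bucket gives a simple quantitative illustration of an inertial force. In the frame spinning with the water, the centrifugal term balances gravity along a parabolic surface. The sketch below assumes rigid rotation in a cylindrical bucket, conservation of the water volume and g = 9.81 m s^-2; the bucket size and spin rate are arbitrary illustrative values.

```python
import math


def water_surface_height(r, bucket_radius, spin_rate, g=9.81):
    """Height (m) of the water surface relative to its level at rest.

    r and bucket_radius are in metres, spin_rate in rad/s. The offset
    bucket_radius**2 / 2 keeps the total volume of water unchanged.
    """
    return spin_rate ** 2 * (r ** 2 - bucket_radius ** 2 / 2.0) / (2.0 * g)


radius = 0.15                 # assumed bucket radius in metres
spin = 2.0 * math.pi          # assumed spin of one revolution per second
rise_at_rim = water_surface_height(radius, radius, spin)
dip_at_centre = water_surface_height(0.0, radius, spin)
print(f"rim: {rise_at_rim * 100:+.2f} cm, centre: {dip_at_centre * 100:+.2f} cm")
```

The rise at the rim equals the dip at the centre, as expected for a paraboloid of fixed volume.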

According to Mach, the Newtonian discussion was incomplete in 
the sense that the existence of the absolute space was postulated arbi¬ 
trarily and in an abstract manner with no reference to the distant stellar 
background. Why does E have a special status in that it does not require 
the inertial force? How can one physically identify E without recourse 
to the second law of motion, which is based on it? 

To Mach the answers to these questions were contained in the obser¬ 
vation of distant parts of the Universe. It is the Universe that provides a 
background reference frame that can be identified with Newton’s frame 
E. Instead of saying that it is an accident that Earth’s rotation velocity 
relative to E agrees with that relative to the distant parts of the Universe,
Mach took it as proof that the distant parts of the Universe must
somehow enter into the formulation of local laws of mechanics. 

One way this could happen is by a direct connection between the 
property of inertia and the existence of the universal background. To 
see this point of view, imagine a single body in an otherwise empty 
Universe. In the absence of any forces (18.1) becomes 

mf = 0. (18.3) 

What does this equation imply? Following Newton we would conclude 
from (18.3) that f = 0, that is, the body moves with uniform velocity. 
But we now no longer have a background against which to measure 
velocities. Thus f = 0 has no operational significance. Rather, the lack 
of any tangible background for measuring motion suggests that f should 
be completely indeterminate. It is not difficult to see that such a result 
follows naturally, provided that we come to the remarkable conclusion, 
also possible from (18.3), that 


m = 0. (18.4) 

In other words, the measure of inertia depends on the existence of 
the background in such a way that in the absence of the background the 
measure vanishes! This aspect introduces a new feature into mechanics 
not considered by Newton. The Newtonian view that inertia is a property 
of matter has to be augmented to the statement that inertia is a property of 
matter as well as of the background provided by the rest of the Universe. 
This general idea can be identified with Mach’s principle.

Such a Machian viewpoint not only modifies local mechanics but 
also introduces new elements into cosmology. For there is no basis 
now for assuming that particle masses would necessarily stay fixed in 
an evolving Universe. This is the reason for considering cosmological 
models anew from the Machian viewpoint. Presented here are some 
instances of how various physicists have given quantitative expression 
to Mach’s principle and arrived at new cosmological models. 

Although Einstein himself was initially impressed by Mach’s argu¬ 
ments, he later came to discount them because they suggested action at 
a distance. For a historical review of Mach’s principle see the collection 
of articles edited by Barbour and Pfister [75]. 

Kurt Gödel demonstrated in 1949 that spinning universes in general
relativity do not subscribe to Mach’s principle. Gödel’s model had the
universe spinning so that the observer at rest in the local inertial frame of
such a universe would see the distant parts of the universe rotating. This
counter-example demonstrated that the basic argument on which Mach’s
principle is formulated cannot itself be guaranteed by general relativity.
Other such ‘anti-Machian’ solutions later emerged from relativistic
cosmology and it became clear that one has to go beyond relativity to
incorporate Mach’s principle.

We briefly recall the twin paradox of Chapter 1. If twins A and B 
argue as to which of them is the inertial observer, we can now suggest a 
practical way of resolving the argument. The one who remains unaccel¬ 
erated relative to the frame provided by the distant parts of the universe 
is the inertial observer. 


18.2 The Brans-Dicke theory 

There have been attempts by later scientists such as Sciama [76], Brans 
and Dicke [77] and Hoyle and Narlikar [78, 79], which modified general 
relativity and hence cosmology to give explicit quantitative expression 
to Mach’s ideas. Of these we will refer to the Hoyle-Narlikar approach 
in Section 18.4.1. The Brans-Dicke theory played a very interesting role 
in offering alternative predictions of the Solar-System tests of gravity, 
which prompted an upsurge of experimental techniques to make accurate 
measurements for distinguishing between the predictions of this theory 
and general relativity. The action principle of this theory is given by 
replacing the Hilbert term in general relativity by 


$\int (\varphi R + \omega\, \varphi^{-1} \varphi_{,k} \varphi^{,k}) \sqrt{-g}\; d^4x$.


The parameter $\omega$ distinguishes the Brans-Dicke theory from general
relativity, with the scalar field $\varphi$ playing the role of $G^{-1}$. By appropriate
scaling, one can show that this theory approaches general relativity as
$\omega \to \infty$. The Solar-System tests have placed a lower limit of the order
of 3000 on this parameter.
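
The approach to general relativity at large $\omega$ can be made concrete with the post-Newtonian parameter $\gamma$, which in the Brans-Dicke theory equals $(1 + \omega)/(2 + \omega)$ and is unity in general relativity. This standard result is not derived in the text and is quoted here as an assumption; the function names are illustrative.

```python
def ppn_gamma(omega):
    """Post-Newtonian parameter gamma in the Brans-Dicke theory (assumed formula)."""
    return (1.0 + omega) / (2.0 + omega)


def deflection_ratio(omega):
    """Light deflection relative to general relativity, which scales as (1 + gamma) / 2."""
    return (1.0 + ppn_gamma(omega)) / 2.0


for omega in (5.0, 50.0, 3000.0):
    print(f"omega = {omega:6.0f}: 1 - gamma = {1.0 - ppn_gamma(omega):.2e}, "
          f"deflection ratio = {deflection_ratio(omega):.5f}")
```

For $\omega = 3000$ the predicted light deflection differs from that of general relativity by less than two parts in 10^4.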

Nevertheless, the cosmological models emerging from the Brans- 
Dicke theory can still be significantly different from standard cosmology 
sufficiently early in the Universe. For example, the inflationary regime 
can be different because of the additional terms in the action. The idea 
seemed to solve some of the conceptual problems of the original infla¬ 
tionary model but ran into trouble because the distortions it produced in 
the cosmic microwave background were unacceptably high. Undeterred 
by these setbacks, the inflation enthusiasts explored a variation on the 
Brans-Dicke theme by adding higher-order couplings of the scalar field 
with gravity, which led to the notion of ‘hyper-extended inflation’. (See
for example the paper by Mathiazhagan and Johri [80].) However, none 
of these ideas seem to have received much following in later years. 
To summarize, considerations of the early and very early Universe
could possibly probe the differences between general relativity and the
Brans-Dicke theory further. Insofar as observations of relatively recent
epochs are concerned, however, because of the largeness of $\omega$, for most
practical purposes the differences between the Brans-Dicke theory and
general relativity are insignificant.


18.3 Spacetime singularity and matter creation 

When the Friedmann models were finally recognized as providing the 
simplest models of the expanding Universe, one aspect of these models 
was somewhat disturbing - their origin in a spacetime singularity. At 
the beginning this was considered an anomaly; the choice of an excep¬ 
tional symmetry of spacetime (the Weyl postulate and the cosmological 
principle) was held responsible for the singularity. Thus, it was argued, 
the introduction of anisotropy in the form of shear and spin in the Uni¬ 
verse would remove the singularity. In this context, A. K. Raychaudhuri 
obtained a simple but elegant result that was to have a far-reaching 
effect on the issue of spacetime singularity [81]. Raychaudhuri showed, 
with the help of an equation determining the evolution of a volume 
element, that the introduction of spin goes towards removing the sin¬ 
gularity, whereas shear has the opposite effect. The irony was that one 
could obtain solutions with shear and no spin, but not with spin and no 
shear. So a demonstration of the avoidance of singularity remained an 
unattainable goal. 

The Raychaudhuri equation arises in relativistic cosmology when we
look at the bundle of timelike geodesics defined by the Weyl postulate.
If $u^i$ is the unit tangent to the geodesic, we define the spin-vorticity
3-tensor for the cosmic fluid by $\omega_{\mu\nu} = \tfrac12 (u_{\mu;\nu} - u_{\nu;\mu})$.

Writing the line element in the form

$ds^2 = dt^2 + 2 g_{0\mu}\, dt\, dx^\mu + g_{\mu\nu}\, dx^\mu dx^\nu$, (18.5)

where the geodesics are specified by $x^\mu$ = constant and t is the cosmic
time, the (0,0) component of field equations in the case of dust of density
$\rho$ then becomes

$\dfrac{\ddot{Q}}{Q} = \tfrac13 (2\omega^2 - 4\pi G\rho - \phi^2)$, (18.6)

where $Q^6 = -g$ and

2 or = -g A '*g ffT 0 )juxGVr. 

<p 2 = ^' v gvag aX g^ - * . (18.7) 
Fig. 18.3. The bundle of
geodesics focusses in the 
future with its cross section A 
decreasing to zero. This effect 
was discussed in the context of 
spacetime singularity by A. K. 
Raychaudhuri. 


The $\phi$ term is identified with shear and it goes the opposite way (to
the spin term) through promoting singularity by helping the scale of the 
cosmic volume, Q, approach zero. It vanishes when the expansion is 
isotropic. 

The Raychaudhuri equation can be stated in a slightly different form
as a focussing theorem. In this form it describes the effect of gravity
on a bundle of null geodesics spanning a finite cross section. Denoting
the cross section by A, we write the equation of the surface spanning
the geodesics as f = constant. Define the normal to the cross-sectional
surface by $k_i = \partial f/\partial x^i$. Figure 18.3 shows the geometry of the bundle.

By invoking the analogue of the hydrodynamic conservation law, we
deduce

$k^i A_{,i} = (k^i{}_{;i}) A$. (18.8)

Additionally we also have from the null geodesic condition

$k^i k_{l;i} = 0$. (18.9)

Using a calculation similar to that which led to the geodetic deviation
equation in Chapter 5, we get the focussing equation as

$\dfrac{1}{\sqrt{A}} \dfrac{d^2 \sqrt{A}}{d\lambda^2} = \tfrac12 R_{im} k^i k^m - |\sigma|^2$, (18.10)

where

$|\sigma|^2 = \tfrac12 k_{i;m} k^{i;m} - \tfrac14 (k^m{}_{;m})^2$. (18.11)

Equation (18.10) is similar to the Raychaudhuri equation with $|\sigma|^2$
being the square of the magnitude of shear. With Einstein’s equations,
we can rewrite (18.10) as

$\dfrac{1}{\sqrt{A}} \dfrac{d^2 \sqrt{A}}{d\lambda^2} = -4\pi G \left(T_{im} - \tfrac12 g_{im} T\right) k^i k^m - |\sigma|^2$. (18.12)

For focussing of the bundle of rays we need $A \to 0$, so the right-hand
side should be negative. This is helped by the shear term in the
above equation, just as Raychaudhuri had found. The first term on the
right-hand side of the focussing equation also has this property if

$(T_{im} - \tfrac12 g_{im} T)\, k^i k^m > 0$.

For dust we have $T_{im} = \rho u_i u_m$ and this condition is satisfied with
the left-hand side equalling $\rho (u_i k^i)^2$. (Remember that $k_i$ is a null vector,
so $g_{im} k^i k^m = 0$.) Thus the normal tendency of matter is to focus light
rays by gravity.
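
The dust condition can be checked numerically in a local inertial frame. The sketch below assumes the flat metric diag(1, −1, −1, −1) and illustrative choices of the four-velocity u and the null vector k; it evaluates $(T_{im} - \tfrac12 g_{im} T) k^i k^m$ for $T_{im} = \rho u_i u_m$.

```python
# Flat metric of a local inertial frame (an assumption of this sketch)
METRIC = (1.0, -1.0, -1.0, -1.0)


def dot(a, b):
    """Scalar product g_im a^i b^m of two four-vectors."""
    return sum(g * x * y for g, x, y in zip(METRIC, a, b))


def focusing_source_dust(rho, u, k):
    """(T_im - g_im T / 2) k^i k^m for dust with T_im = rho u_i u_m."""
    trace = rho * dot(u, u)                      # T = rho u_i u^i
    return rho * dot(u, k) ** 2 - 0.5 * trace * dot(k, k)


k_null = (1.0, 1.0, 0.0, 0.0)                    # null: g_im k^i k^m = 0
u_rest = (1.0, 0.0, 0.0, 0.0)                    # dust at rest
u_moving = (1.25, 0.75, 0.0, 0.0)                # dust moving at 0.6c along x

print(dot(k_null, k_null))                       # vanishes for a null vector
print(focusing_source_dust(2.0, u_rest, k_null))
print(focusing_source_dust(2.0, u_moving, k_null))
```

In both cases the result is positive and equal to $\rho (u_i k^i)^2$, since the trace term drops out for a null vector.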
The singularity theorems of Penrose and Hawking [82] use this basic
feature to state conditions that inevitably lead to a spacetime singularity.
The condition of the positivity of the $T_{im}$ term in the equation above
plays a crucial role in general. We will not go into these details except
to highlight this work as a field deserving further research. In particular,
the positive-energy condition suggests that there may be non-singular
spacetimes if it is violated and there are negative-energy fields. We will
now describe a line of thinking in which such fields are used to avoid
the initial (or any) singularity.

18.4 The quasi-steady-state cosmology 

In the late 1940s, H. Bondi and T. Gold [83] and F. Hoyle [84] independently
proposed the steady-state cosmology as an alternative to the
standard cosmology. The cosmology envisaged the Universe as described
by the Robertson-Walker line element, with k = 0 and S(t) = exp(Ht),
where the Hubble constant H is strictly a constant. In fact the name
‘steady state’ implies that the spacetime has a timelike Killing vector,
and that physical conditions at any epoch t are the same. One consequence
of this requirement is that as the Universe expands there is
creation of matter to keep its density $\rho$ constant, the rate of creation
per unit volume being $3H\rho$. The cosmology thus has no singular epoch
and no hot past. Bondi and Gold believed that the entire dynamics and
physics of the Universe should follow from a single principle which they 
enunciated as the perfect cosmological principle. This principle takes 
the usual cosmological principle a stage further by additionally requiring 
homogeneity of the Universe with time. For, the authors argued, without 
such an invariance being guaranteed, one cannot be confident that the 
laws of physics known today had the same form at all times past and 
present. Without such a guarantee, one cannot interpret observations of 
the distant Universe unambiguously. Bondi and Gold called this model 
the steady-state model. Hoyle arrived at the same model by modifying 
Einstein’s field equations by adding terms that allowed for creation of 
matter. His approach had been more physical than philosophical and 
dictated by the requirement to understand the origin of all the matter 
observed in the Universe today. 
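
The creation rate $3H\rho$ follows from requiring that the mass $\rho V$ in a volume $V \propto \exp(3Ht)$ expanding with the Universe grow at the rate $\rho\, dV/dt = 3H\rho V$. As an order-of-magnitude sketch, the code below evaluates it for an assumed H = 70 km s^-1 Mpc^-1 and, as a further assumption, a density equal to $3H^2/(8\pi G)$; neither value is taken from the text, and the function name is illustrative.

```python
import math

G_SI = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
MPC_M = 3.086e22        # one megaparsec in metres
YEAR_S = 3.156e7        # one year in seconds
M_HYDROGEN = 1.674e-27  # mass of a hydrogen atom in kg


def creation_rate(hubble_km_s_mpc, density=None):
    """Steady-state creation rate 3 H rho in kg m^-3 s^-1.

    If no density is given, 3 H^2 / (8 pi G) is assumed.
    """
    hubble = hubble_km_s_mpc * 1.0e3 / MPC_M     # convert to s^-1
    if density is None:
        density = 3.0 * hubble ** 2 / (8.0 * math.pi * G_SI)
    return 3.0 * hubble * density


rate = creation_rate(70.0)
atoms_per_m3_per_gyr = rate * YEAR_S * 1.0e9 / M_HYDROGEN
print(f"{atoms_per_m3_per_gyr:.1f} hydrogen-atom masses per cubic metre per 10^9 years")
```

Under these assumptions the rate is of the order of one hydrogen-atom mass per cubic metre per 10^9 years.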

In the 1950s and early 1960s, the steady-state cosmology provided a 
stimulus to observers to stretch the limits of their observing technology 
to test the predictions of this model and to distinguish it from the standard 
cosmology. In the end most cosmological tests involving discrete source 
populations turned out to be inconclusive, as it became clear that one 
first needs to understand the various sources of observational errors as 
well as the physical properties of the sources used for the tests before 
drawing unequivocal conclusions. Nevertheless, the steady-state theory
failed on two important counts, namely providing a setting for the origin 
of light nuclei (especially deuterium and helium) and explaining the 
origin of the microwave background. 

The theory, which had been abandoned in the 1970s and 1980s, 
was revived in a new form by F. Hoyle, G. Burbidge and J. V. Narlikar
in 1993 [85] and developed to some level of detail in a number of 
papers. These details include the basic rationale and genesis of the idea, 
its astrophysical and observational consequences, a formal theoretical 
structure, cosmological models and a model for structure formation. (For 
these details in one place, see Reference [86].) We briefly summarize 
and assess this quasi-steady-state-cosmology (QSSC) model, since we 
feel that, although it has not been studied in anything like the detail 
one finds for the standard model, at present it is the only available 
alternative to which the same observational and theoretical criteria for a 
viable cosmology can be applied. 


18.4.1 Broad features of the QSSC model

The theoretical structure of this cosmology and its relationship to obser¬ 
vations are summarized below. 

(1) The cosmology is based on the Machian theory of gravitation first 
proposed by Hoyle and Narlikar in 1964 [78, 79]. The theory of Hoyle 
and Narlikar starts with the premise that the inertial mass of any particle 
is determined by the surrounding Universe. In field-theoretical language, 
the inertia is a scalar field whose behaviour is determined by an action 
principle. As shown later by Hoyle et al. [87], the theory permits broken 
particle world lines, i.e., creation and destruction of matter. In the cosmological
approximation of a well-filled Universe, the field equations
become

$R_{ik} - \tfrac12 g_{ik} R + \lambda g_{ik} = -\dfrac{8\pi G}{c^4} \left[ T_{ik} - f \left( C_i C_k - \tfrac14 g_{ik} C^l C_l \right) \right]$, (18.13)

where C is the scalar field representing the inertial effect associated with 
the creation of a new particle, and a consequence of Mach’s principle is 
that the constants in these equations can be related to the fundamental 
constants of microphysics and the large-scale features of the Universe. 
Thus, restoring c for the sake of units, we have 



Here $m_P$ is the mass of the basic particle created and N the number of
such particles in the observable Universe. From the above it is easy to
identify $m_P$ with the Planck mass, which makes N of the order of $10^{60}$
and $\lambda$ of the order of $10^{-56}$ cm$^{-2}$. Notice that its sign is negative, i.e.,
it represents an attractive rather than a repulsive force. The coupling
constant f is positive, thus requiring the C-field stress and energy
to act repulsively on matter and space because of the explicit minus
sign in the stress tensor. It is assumed that the creation of a particle of
mass $m_P$ is possible, provided that a ‘creation threshold’ is attained by
the ambient C-field, namely, $C_i C^i = m_P^2$. At the time of creation the
momentum of the new particle is balanced by $C_i$. In such cases, we may
have situations with $T^{ik}{}_{;k} \neq 0$, although the divergence of the right-hand
side overall is zero.

(2) The cosmological models in this theory are driven by the creation
process, and it is argued that the creation process does not occur
uniformly everywhere, but preferentially near massive objects that have
collapsed to something close to the state of a black hole. This is because
the gravitational field in the neighbourhood of such an object is high and
permits the local value of $C_i C^i$ to rise high enough to reach the creation
threshold. The Planck particle so created is assumed to be unstable,
however, and decays, within a time scale of the order of $10^{-43}$ s, into
baryons, leptons, pions, etc. along with the release of a substantial amount
of energy. The creation of matter is compensated for by the creation of the
C-field, and, as the strength of the field rises, its repulsive effect makes
the space expand rapidly (as in the inflationary scenario), thus causing
an explosive ejection of matter and energy. The origin and outpouring of
very high energy in quasars, active galactic nuclei, etc. are claimed by
the QSSC to be phenomena representing minicreation events like these.

In a typical minicreation event, the central object itself may break up
as its gravitational binding is loosened by the growth of the negatively
coupled C-field. It may then also happen that the central object ejects
a coherent piece along the line of least resistance. The QSSC authors
argue that some of the ‘anomalous redshift’ cases (see [88, 89]) can be
explained by invoking this phenomenon. What are these cases? Typically
in such a case one sees two objects, e.g., a quasar and a galaxy, very
close to each other but with very different redshifts. The probability of
their being projected close to each other by chance is very low. Are they
near neighbours? If so, their different redshifts violate Hubble’s law. Two
cases of such anomalous redshifts are shown in Figure 18.4.

Fig. 18.4. Two typical cases of anomalous redshifts. In (a) we have two
quasars of redshifts 0.4 and 0.65 aligned across an NGC galaxy of
redshift 0.002. The precise alignment and close proximity suggest
ejection of quasars, which are X-ray sources, by the galaxy which also
houses an X-ray source. In (b) we have a big galaxy, NGC 7603, apparently
connected by a filament to a companion galaxy. The redshifts of the two
galaxies are 0.029 and 0.056, respectively. In either of cases (a) and (b),
for maintaining consistency with Hubble's law one has to assume all these
configurations to be projection effects with probabilities as low as
$10^{-4}$.

(3) The cosmological solutions are driven by the minicreation events,
each of which produces a local expansion of space. The averaged effect
of a large number of such events over a cosmological volume can be
approximated by a homogeneous and isotropic solution of the field
equations. As in the standard cosmology, the Robertson-Walker line
element can be used to describe such a spacetime. The work of Sachs
et al. [90] has shown that the generic solution for all three cases,
$k = +1, 0, -1$, has a long-term steady expansion interspersed with
short-term oscillations. For example, the scale factor for $k = 0$ is given
by

$$S(t) = \exp(t/P)\,[1 + \eta\cos\tau(t)],$$

where $0 < \eta < 1$, so that $S$ oscillates between two finite values and
$\tau(t)$ is almost like $t$ during most of the oscillatory cycle, differing
from it mostly during the stage when $S$ is close to the minimum value. The
period of oscillation $Q$ is small compared with $P$. The QSSC is therefore
characterized by the following parameters: $P$, $Q$, $\eta$ and $z_{\max}$,
the maximum redshift seen by the present observer in the current cycle. Sachs
et al. [90] took $P = 20Q$, $Q = 4.4 \times 10^{10}$ years, $\eta = 0.8$ and
$z_{\max} = 5$ as an indicative set of values. The QSSC workers have argued
that the cosmology is by no means tightly constrained around these values by
the various cosmological tests. Figure 18.5 illustrates one such case.

Fig. 18.5. The scale factor of a typical model of the quasi-steady-state
cosmology. See the text for details.

(4) How is the cosmic microwave background (CMB) produced in
this model? The QSSC oscillations are finite, with the maximum redshift
observable in the present cycle at ~5–6. Thus each cycle is
matter-dominated. The radiation background is, however, maintained from one
cycle to the next. Thus, from the minimum scale phase of one cycle to the
next, its energy density is expected to fall by a factor $\exp(-4Q/P)$.
This drop is made up by the thermalization of starlight produced during
the cycle. Thus, if $\epsilon$ is the energy density of starlight generated
in a cycle and $u_{\max}$ is the energy density of the CMB at the start of a
cycle, then $\epsilon = 4u_{\max}Q/P$ to first order in $Q/P$. If the cycle
minimum occurred at redshift $z_{\max}$, then the present CMB energy density
would be $P\epsilon/[4Q(1 + z_{\max})^4]$. By substituting the values of
$\epsilon$, $P$, $z_{\max}$ and $Q$ we can estimate the present-day energy
density of the CMB and the result agrees well with the observed value of
~$4 \times 10^{-13}$ erg cm$^{-3}$ corresponding to a temperature ~2.7 K.

How is the starlight thermalized? Consider the following scenario.
The cooling of metallic vapours produces whisker-like particles of
lengths ~0.5–1.0 mm, which convert optical radiation into millimetre-wave
radiation. Such whiskers typically form in the neighbourhood of
supernovae (which synthesize and eject metals), and are subsequently
pushed out of the galaxy through the pressure of shock waves. It can
be shown that a density of ~$10^{-35}$ g cm$^{-3}$ of such whiskers close to
the minimum of the oscillatory phase would suffice for thermalization of
starlight.

While the thermalized radiation from previous cycles will be very
smoothly distributed, a tiny fraction (~$10^{-5}$) will reflect anisotropies
on the scales of rich clusters of galaxies in the present cycle. The angular
scales for this anisotropy will be of the order of ~1/100, ~1/250
for clusters and superclusters, corresponding to $l$-values ~100–200.
A recent comparison with the WMAP data shows an acceptable fit to
observations of the power spectrum of CMB fluctuations [91].

(5) In a recent paper Burbidge and Hoyle [92] argued that a case 
may be made for all isotopes having been made in stars, including 
the light ones generally assumed to be of primordial origin. They 
showed that possible stellar scenarios exist for production of these 
nuclei. 

(6) The QSSC has been applied to the redshift-magnitude relation
obtained by using Type Ia supernovae. Narlikar et al. [93] have reexamined
the problem in the context of the QSSC for the data used for fitting
the standard models, with or without the cosmological constant. As we
have seen, the QSSC requires intergalactic dust in the form of metallic
whiskers. This whisker population acts to produce further absorption in
the light from distant galaxies and supernovae therein. Taking this effect
into account, the QSSC model can be fitted to data by taking the dust
density as a free parameter. The optimized fit turns out to be quite
satisfactory. Also the optimum whisker density turns out to be in the right
range for thermalization of starlight into the microwave background.
Thus there is an overall consistency in the parameters used.

(7) Preliminary work on structure formation has shown that the pattern
of filaments and voids for clusters can be generated by minicreation
events. Assuming that creation of new galaxies takes place selectively
near highly dense regions, and that too at the maximum density phase
of a typical QSSC cycle, one can simulate the resulting distribution for
$10^5$–$10^6$ galaxies on a computer. It is observed that an initial random
distribution changes over into a supercluster-void distribution after a
few cycles. The two-point correlation function of the galaxies created
also tends to a power-law form with the index $-1.8$, as observed.
See Reference [86].

While the various physical and astrophysical aspects of the QSSC 
have not been studied in anything like the depth to which the standard 
cosmology has been probed, these preliminary studies suggest that the 
cosmology, certainly as an alternative to the currently favoured option, 
deserves more critical attention than it has so far received. 
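
The numbers quoted in items (3) and (4) above can be checked with a few lines of arithmetic. The following is only a rough sketch under stated assumptions: the phase of the oscillation is taken as $2\pi t/Q$ (that is, $\tau(t)$ is replaced by $t$ and scaled so that $Q$ is the period), the function names are illustrative, and the radiation constant is the standard cgs value rather than a number taken from the text.

```python
import math

# Indicative parameter set quoted in the text (Sachs et al. [90]).
Q_YEARS = 4.4e10           # oscillation period Q, years
P_YEARS = 20.0 * Q_YEARS   # long-term expansion time scale, P = 20 Q
ETA = 0.8                  # oscillation amplitude, 0 < eta < 1
Z_MAX = 5.0                # maximum redshift in the current cycle

# Radiation constant a = 4 sigma / c in cgs units (erg cm^-3 K^-4).
RADIATION_CONSTANT = 7.5657e-15


def scale_factor(t_years, eta=ETA, p=P_YEARS, q=Q_YEARS):
    """S(t) = exp(t/P) [1 + eta cos(tau)] for k = 0.

    ASSUMPTION: the phase is taken as tau = 2 pi t / Q, i.e. tau(t) is
    replaced by t and scaled so that Q is the oscillation period.
    """
    phase = 2.0 * math.pi * t_years / q
    return math.exp(t_years / p) * (1.0 + eta * math.cos(phase))


def blackbody_energy_density(temperature_k):
    """Energy density u = a T^4 of blackbody radiation, in erg cm^-3."""
    return RADIATION_CONSTANT * temperature_k ** 4


def starlight_per_cycle(u_present, z_max=Z_MAX, p=P_YEARS, q=Q_YEARS):
    """Invert u_present = P eps / [4 Q (1 + z_max)^4] for eps."""
    return 4.0 * q * u_present * (1.0 + z_max) ** 4 / p


u_now = blackbody_energy_density(2.7)
eps = starlight_per_cycle(u_now)
print(f"S at t = 0:                  {scale_factor(0.0):.3f}")
print(f"S at t = Q/2:                {scale_factor(Q_YEARS / 2):.3f}")
print(f"CMB energy density at 2.7 K: {u_now:.2e} erg cm^-3")
print(f"starlight needed per cycle:  {eps:.2e} erg cm^-3")
```

The last function simply inverts the relation of item (4), so it shows how much starlight per cycle the indicative parameters would require; it does not test whether stars can actually supply that amount.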


18.5 Quantum gravity 

Experience in the rest of physics (except gravity) shows that the classical
equations of fields and particles break down at the microscopic level, to
be replaced by the notions of quantum theory. When does one make a
transition from the classical to the quantum version? A ‘rule of thumb’ is
to evaluate the action $A$ over the characteristic 4-volume for the problem
and compare with $\hbar$. If the ratio $A/\hbar$ is much larger than unity
then the problem can be adequately handled by classical physics. If the ratio
is comparable to unity then we need the quantum version to solve the
problem.

How does this prescription work for gravity? A look at the action
principle (8.7) shows that the limit sought above can be obtained by
equating the gravitational action

$$A_g = \frac{c^3}{16\pi G}\int_V R\sqrt{-g}\,d^4x \qquad (18.14)$$

to Planck’s constant. For $A_g \gg \hbar$ we can trust our classical
description of spacetime geometry, whereas for $A_g \lesssim \hbar$ a quantum
description of cosmology is indispensable. But to evaluate $A_g$ we need $V$,
the 4-volume of the spacetime manifold.

In the big-bang model we take $V$ as the 4-volume enclosed by the
particle horizon and bounded by the time span of the Universe. Thus at
any epoch $t$ for $k = 0$, $S \propto t^{1/2}$, the particle horizon is
defined by

$$rS = 2ct.$$

For $S \propto t^{1/2}$, $R = 0$ and so $A_g = 0$. However, this happens
because the trace of $T^i_k$ is zero in the early Universe. As an
order-of-magnitude estimate we may take $R^0_0$ instead of $R$ in the
computation of $A_g$: $R^0_0$ gives us an idea of how the geometrical part of
the action changes with time. For $S \propto t^{1/2}$,
$R^0_0 = 3/(4c^2t^2)$. Thus up to the epoch $t$

$$A_g = \frac{c^4}{16\pi G}\int_0^t \frac{3}{4c^2t_1^2}\,\frac{4\pi}{3}\,(2ct_1)^3\,dt_1 = \frac{c^5t^2}{4G}.$$

By equating $A_g$ to $\hbar$ we get, to order of magnitude,

$$t = t_P \sim \left(\frac{G\hbar}{c^5}\right)^{1/2}. \qquad (18.15)$$

This time span is called the Planck time. No classical discussion of
gravity can be pushed to time scales $t < t_P$. We have already encountered,
in Chapter 16, the very short time scales on which GUTs operated. The above
quantum-gravity time scale corresponds to an even
higher energy of $E \sim 10^{19}$ GeV. This energy, as seen from (18.15), is
simply $\sim\hbar/t_P$.
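
As a numerical check of the estimate above, the short sketch below integrates the order-of-magnitude action up to an epoch $t$ and compares it with $\hbar$. It is an illustration only: the constants are rounded cgs values, the function names are ours, and the conventional definition $t_P = (G\hbar/c^5)^{1/2}$ is assumed. With the integrand used above the action equals $c^5t^2/(4G)$, so it reaches $\hbar$ at twice this conventional value, which is the same thing to order of magnitude.

```python
import math

# Physical constants in cgs units (rounded values).
G = 6.674e-8         # gravitational constant, cm^3 g^-1 s^-2
HBAR = 1.0546e-27    # reduced Planck constant, erg s
C = 2.998e10         # speed of light, cm s^-1
ERG_PER_GEV = 1.602e-3


def action_integrand(t1):
    """(c^4 / 16 pi G) * R^0_0 * (volume inside the horizon) at epoch t1."""
    r00 = 3.0 / (4.0 * C**2 * t1**2)
    horizon_volume = (4.0 * math.pi / 3.0) * (2.0 * C * t1) ** 3
    return C**4 / (16.0 * math.pi * G) * r00 * horizon_volume


def gravitational_action(t, steps=1000):
    """Midpoint-rule estimate of A_g accumulated from 0 to t, in erg s."""
    dt = t / steps
    return sum(action_integrand((i + 0.5) * dt) for i in range(steps)) * dt


def planck_time():
    """Conventional Planck time sqrt(G hbar / c^5), in seconds."""
    return math.sqrt(G * HBAR / C**5)


t_p = planck_time()
t_equal = 2.0 * t_p  # root of c^5 t^2 / (4 G) = hbar
print(f"Planck time:             {t_p:.2e} s")
print(f"A_g / hbar at t = 2 t_P: {gravitational_action(t_equal) / HBAR:.3f}")
print(f"hbar / t_P:              {HBAR / t_p / ERG_PER_GEV:.2e} GeV")
```

The integrand is linear in $t_1$, so the midpoint rule reproduces the closed form essentially exactly; the numerical integration is kept only to make each factor of the estimate explicit.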

Thus the present discussions of GUTs and cosmology already take 
us right up to the Planck epoch. Whether the Universe did indeed have 
a spacetime singularity at t = 0 should be determined not by classical 
general relativity but by an appropriate theory of quantum gravity. 

There are several conceptual and operational problems on the way to 
a quantum theory of gravity, if we are to look for a quantized version of 
general relativity. To begin with, the non-linearity of relativity makes the 
methods which work for standard ‘flat-space field theories’ inapplicable 
here. Secondly, in relativity spacetime geometry and gravity are inextricably
mixed and so one is not sure what is to be quantized. Thirdly, in
flat-space quantizations, inclusion of the dynamical nature of geometry 
is not required: here it is an essential feature of the problem. 

It is not surprising therefore that the quantized version of general 
relativity has not yet emerged. At present the goal of having a working 
theory of quantum gravity seems far away. The different approaches that 
have been tried in order to quantize gravity do not agree on the answer 
to the following question: did the Universe have a singular epoch? A 
simple approach based on conformal fluctuations suggests that, if we 
include quantum fluctuations of homogeneous and isotropic universes, 
then the spacetime singularity would ‘most probably’ be averted. The
probability here is in the sense of quantum mechanics. An event is most
probable if the quantum probability of its not happening has measure
zero. The result can in fact be stated in a more general form proved by this
author, namely that, if one considers the most general quantum conformal
fluctuations of a classical singular cosmological solution, then, most
probably, the singularity is not present in these fluctuations [94].


18.5.1 Radiating black holes 

The quantum theory of gravity being recognized as a long-term 
project, work has been proceeding in the meantime on a simpler notion, 
that of field-theory quantization in curved spacetime; and it is producing 
some interesting (and unexpected) results. Here we present in brief the 
original example of this approach applied to black-hole physics. 

As the name ‘black hole’ implies, we do not expect any radiation to
come out of such an object. For a spherical object of mass $M$, the
black-hole condition is reached when its surface area equals $4\pi R_s^2$,
where $R_s$, the Schwarzschild radius, is given by

$$R_s = \frac{2GM}{c^2}. \qquad (18.16)$$

No material particle or light signal emitted from $R < R_s$ can go into the
region $R > R_s$: at least, this is what classical general relativity tells us.
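
To get a feeling for the sizes involved, (18.16) can be evaluated directly. The sketch below uses rounded cgs constants and an illustrative function name; the solar mass value is the standard one and is not taken from the text.

```python
# Rounded cgs constants.
G = 6.674e-8             # gravitational constant, cm^3 g^-1 s^-2
C = 2.998e10             # speed of light, cm s^-1
SOLAR_MASS_G = 1.989e33  # mass of the Sun, g


def schwarzschild_radius_cm(mass_g):
    """Schwarzschild radius R_s = 2 G M / c^2 of equation (18.16), in cm."""
    return 2.0 * G * mass_g / C**2


r_sun_km = schwarzschild_radius_cm(SOLAR_MASS_G) / 1.0e5
print(f"R_s for one solar mass: {r_sun_km:.2f} km")
```

Because $R_s$ is linear in $M$, the result for any other mass follows by simple scaling.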

We saw in Chapter 13 that the behaviour of black holes is in many 
ways analogous to thermodynamics. Thus the area of the horizon is like 
entropy and surface gravity like temperature. Can this analogy be pushed 
further, closer to becoming reality? If so, temperature implies radiation 
and the black hole is expected to radiate. This seemed a very unlikely 
conclusion given the physical nature of black holes. 

Nevertheless, in 1974 Stephen Hawking [95] made the remarkable
suggestion that a black hole can radiate. Hawking’s calculation went
beyond classical physics: it considered what happens when any field (for
example, the electromagnetic field) is quantized in the spacetime
containing a black hole. As we have already seen, the quantum-mechanical
description of vacuum is much more involved than the classical description,
which simply states that a vacuum is empty. According to quantum
field theory, the vacuum is seething with virtual particles and
antiparticles whose presence cannot be detected directly. Their interference
with physical processes in spacetime can, however, lead to detectable results.
Hawking found that one such result when considered in the spacetime
outside a black hole is that an observer at infinity sees a flux of particles
coming out from the vicinity of a black hole. We will not go through
the calculations leading to this result; we will simply study the
consequences of such a process in the early Universe. Figure 18.6 provides
a qualitative description of how the Hawking process operates. Not all
aspects of the Hawking process have been worked out yet. An important
issue still unresolved, for example, is that of back reaction: how the
emission of particles by the black hole affects and alters the geometry
of spacetime outside and what effect this change has on the process of
radiation by the black hole.

Fig. 18.6. The event horizon of a black hole is shown in the midst of the
vacuum containing virtual particle-antiparticle pairs. In some cases one
member of the pair with negative energy (case III) is gobbled up by the
black hole. The other member with positive energy is set free and gives
the impression that the black hole has emitted it.

The idea we shall use here is that a spherical black hole of mass $M$
ejects particles in a thermal spectrum of temperature $T$ given by

$$kT = \frac{\hbar c^3}{8\pi GM}, \qquad T \approx 10^{26} M_g^{-1}\ \mathrm{K}, \qquad (18.17)$$

where $M_g = M$ expressed in grams. The emission of particles by the
black hole as per the rules of blackbody radiation leads to a mass-loss
rate given by

$$\frac{dM_g}{dt} \sim -10^{26} M_g^{-2}\ \mathrm{s}^{-1}. \qquad (18.18)$$

The $\sim$ implies that a numerical constant of the order of 1 appears on
the right-hand side to take account of the number of particle species
emitted. If we integrate (18.18) we find that the entire mass of the black
hole is radiated away in a time $\tau$ given by

$$\tau \sim 3 \times 10^{-27} M_g^3\ \mathrm{s}. \qquad (18.19)$$

Thus a black hole created soon after the big bang with a mass exceeding
~$5 \times 10^{14}$ g would just about last until the present day.
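
The formulae (18.17)–(18.19) are easy to explore numerically. The sketch below is a rough illustration: the constants are rounded cgs values, the function names are ours, and the mass-loss coefficient $10^{26}$ is taken at face value from (18.18) even though it is uncertain by a factor of order 1.

```python
import math

# Rounded cgs constants.
G = 6.674e-8         # gravitational constant, cm^3 g^-1 s^-2
HBAR = 1.0546e-27    # reduced Planck constant, erg s
C = 2.998e10         # speed of light, cm s^-1
K_B = 1.3807e-16     # Boltzmann constant, erg K^-1

# Coefficient of (18.18), taken at face value (order-of-magnitude only).
MASS_LOSS_COEFF = 1.0e26


def hawking_temperature_k(mass_g):
    """T = hbar c^3 / (8 pi G k M) from (18.17), in kelvin."""
    return HBAR * C**3 / (8.0 * math.pi * G * K_B * mass_g)


def evaporation_time_s(mass_g):
    """Integrate dM/dt = -1e26 M^-2 from M down to 0: tau = M^3 / (3e26)."""
    return mass_g**3 / (3.0 * MASS_LOSS_COEFF)


def mass_for_lifetime_g(tau_s):
    """Initial mass (in grams) that evaporates completely in a time tau_s."""
    return (3.0 * MASS_LOSS_COEFF * tau_s) ** (1.0 / 3.0)


print(f"T for M = 1 g:           {hawking_temperature_k(1.0):.2e} K")
print(f"T for M = 2e33 g:        {hawking_temperature_k(2.0e33):.2e} K")
print(f"lifetime for M = 5e14 g: {evaporation_time_s(5.0e14):.2e} s")
print(f"lifetime for M = 2e33 g: {evaporation_time_s(2.0e33):.2e} s")
```

The exact integral of (18.18) gives a coefficient $1/(3 \times 10^{26}) \approx 3.3 \times 10^{-27}$, which (18.19) rounds to $3 \times 10^{-27}$; given the order-of-1 uncertainty in (18.18) the difference is immaterial.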

The process described above is slow to start with, when a black hole
is massive and cold. However, as $M$ decreases $T$ rises and the mass-loss
rate increases until finally it reaches a catastrophically high level.
This final stage is often called evaporation or explosion of a black hole.
As seen above, a stellar-mass black hole ($M_g > 10^{33}$) is hardly likely
to explode within the lifetime of the Universe! Since the black holes
considered in various astrophysical scenarios are at least as massive as
$2M_\odot$, for them the Hawking process is only of academic interest.

However, it is claimed that there are scenarios in the very early 
Universe that could lead to the formation of primordial black holes 
(PBHs) of masses much lower than $M_\odot$. Bernard Carr in 1975 was
the first to discuss their consequences at length. Carr investigated PBH 
formation and evaporation in order to see whether the currently observed 
nucleon density as well as the microwave background can be explained 
in terms of emission of baryons, leptons, photons and so on by low-mass 
black holes. These concepts are highly speculative, and have not been 
suitably integrated with the other (equally speculative!) scenarios of the 
very early Universe. 

The interesting aspect of this approach is that PBHs act as sources of 
various particles that need somehow to be created in the Universe. The 
suggestion that PBHs evaporating today might account for the observed 
γ-ray bursts, however, does not seem to be correct, since the spectrum
of γ-rays emitted by a PBH is not like the spectrum observed in burst
events. 

There are several loose ends still to be sorted out in the PBH scenario.
At the deepest level one has to understand how they can form in
the first place, since the usual process of gravitational collapse that is
supposed to lead to stellar or more-massive black holes cannot apply
here. Next one needs to express the concepts of thermodynamics and
statistical mechanics in highly curved spacetime in order to give precise
meaning to the notions of temperature and blackbody spectrum:
the formulae (18.17) and (18.18) merely use a naive extrapolation of 
flat-spacetime thermodynamics. Further, the problem of back reaction 
still remains unresolved. Finally, on the observational front, this bizarre 
concept still awaits a befitting application in the real Universe. 

18.6 Concluding remarks 

This brings us to the end of this chapter as well as this book. We have 
tried to cover the theory of relativity at an elementary level. The present 
chapter gives some glimpses into concepts not covered in the book.
Since the 1960s general relativity has added several new feathers to its
cap, both in applications to observations (e.g., relativistic astrophysics, 
gravitational radiation, gravitational lensing) and purely in theory (e.g., 
spacetime singularity, field quantization in curved spacetime). It has 
inspired further intellectual developments such as the loop theory of 
quantum gravity [96] and string theory [97]. We have kept away from 
both these descriptions since definitive and observable conclusions have 
still to emerge from these very interesting approaches. 

In the end, we close by quoting a short verse by Jack C. Rossetter 
that was inspired in 1950 by the popular belief that general relativity is 
a very difficult-to-understand type of theory: 

To Einstein, hair and violin 
We give our final nod. 

Though understood by just two folks, 

Himself - and sometimes God. 

[From The Mathematics Teacher, November 1950, p. 341]


Most of that mystique round general relativity has by now dissipated
and it is seen today as an intellectual achievement par excellence,
enriching, rather than isolated from, the rest of physics. 


Exercises 

1. In Newtonian gravity an oblate Sun will generate a gravitational potential

$$\phi = \frac{GM_\odot}{r}\left[1 - J\left(\frac{R_\odot}{r}\right)^2 P_2(\cos\theta)\right],$$

where $J$ is the quadrupole-moment parameter and $P_2$ is the second Legendre
polynomial. Show that the orbit of a planet precesses because of the above
gravitational effect at the rate $3\pi R_\odot^2 J/l^2$ per revolution, where
$l$ is the semi latus rectum of the orbit. Estimate the precession rate for
Mercury for $J = 2.5 \times 10^{-5}$. What significance does this calculation
have for the Brans-Dicke theory?

2. The Brans-Dicke theory can be re-expressed as a theory in which $G$ =
constant but the particle masses change with epoch. Show that this is achieved
by a conformal transformation

$$\bar g_{ik} = \frac{\phi}{\bar\phi}\,g_{ik}, \qquad \bar\phi = \text{constant}.$$

The field equations then become (in the new metric)

$$\bar R_{ik} - \tfrac{1}{2}\bar g_{ik}\bar R = -\kappa\bar T_{ik},$$

where $\kappa$ is constant. Although these look like Einstein’s equations, the
$\bar T_{ik}$ contain $\phi$ and its derivatives. Show from the new field
equations that

$$\Box\ln\phi = \frac{8\pi G}{(2\omega + 3)c^4}\,\bar T$$

with $G$ = constant. This form of the theory was obtained by Dicke in 1962. The
particle masses in this version vary as

$$m = \bar m\sqrt{\frac{\bar\phi}{\phi}}, \qquad \bar m = \text{constant}.$$


3. Show that the deceleration parameter for the steady-state universe is equal to
$-1$ at all epochs.

4. Discuss the validity of the following statement: ‘Of the various ways of 
resolving Olbers’ paradox, the only way open to the steady-state model is that 
of the expansion of the Universe’. 

5. Write down the expression for the angle subtended at the observer by a
spherical cluster of radius $R$ at $z = z_{\max}$ in the QSSC. Relate this
expression to the angular scale of anisotropy of the microwave background in
the QSSC.

6. Explain why the QSSC does not have an Olbers-type problem of darkness of 
the night sky. 

7. Assuming that our Galaxy has been radiating at the rate of $4 \times 10^{43}$
erg s$^{-1}$ for a time $3 \times 10^{17}$ s and that this energy is derived
from conversion of hydrogen to helium, estimate how much helium is formed in
this way. (Energy of $6 \times 10^{18}$ erg g$^{-1}$ is released when hydrogen
is converted to helium.) Comment on this answer in relation to the primordial
mass fraction of helium obtained in Chapter 16.

8. Compute $A_g(t)$ for the closed Friedmann model with given values of $q_0$
and $h_0$, taking the time interval as $(0, t)$ and the spatial extent covering
the whole (spherical) space. Estimate the epoch at which $A_g = \hbar$. Why do
you get an answer different from $t_P$?

9. Show that, at the Planck epoch, the Schwarzschild radius of a primordial
black hole just filling the particle horizon is of the same order as the Compton
wavelength of the black hole.